Hi Unity team and community,
I am trying to understand the practical differences between Sentis and onnxruntime, and when one should be used over the other.
- Is Sentis its own runtime implementation?
- Does Sentis do some runtime optimizations that onnxruntime does not?
- Is compatibility of Sentis across different platforms better?
There are a lot of code samples and examples for onnxruntime, which makes it much easier to use right now. That also makes me wonder what the advantage of Sentis is.
Thanks in advance for any clarifications
- Yes, we use our own runtime implementation.
- Yes, for some models we have optimized for, we are faster than onnxruntime. Overall we aim to match its performance.
- On anything that isn't CUDA, onnxruntime won't run on the GPU. On consoles, onnxruntime doesn't work at all. For mobile, onnxruntime needs a platform-specific codepath to compile the code for that platform.
With Sentis you write your code once and can deploy it to any platform you want. Same code.
You can test on a laptop and be confident that it works everywhere.
And it's much easier to interface with Unity resources.
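To make that concrete, something like this is all you need (a minimal sketch against the Sentis 1.x API, so names like WorkerFactory, BackendType, and TensorFloat may differ in other package versions; the RunModel class and RunInference method are placeholders):

```csharp
using Unity.Sentis;
using UnityEngine;

public class RunModel : MonoBehaviour
{
    public ModelAsset modelAsset; // an ONNX model imported into the project
    IWorker worker;

    void Start()
    {
        var model = ModelLoader.Load(modelAsset);
        // The same script deploys to any platform; only the backend enum
        // changes (BackendType.GPUCompute, GPUPixel, or CPU).
        worker = WorkerFactory.CreateWorker(BackendType.GPUCompute, model);
    }

    void RunInference(TensorFloat input)
    {
        worker.Execute(input);
        var output = worker.PeekOutput() as TensorFloat;
        // ... consume the output ...
    }

    void OnDestroy()
    {
        worker?.Dispose();
    }
}
```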
I would ask you: what is easier to do in onnxruntime that isn't in Sentis?
And what are we missing to make your life easier?
Hey Alexandre,
Thank you for the swift and clarifying response! I see the advantage. Sounds super.
It would be great to have some official performance comparisons. I guess you're already doing that internally, so it would be helpful for developers to know what performance to expect from Sentis, along with information on what sets its implementation apart from onnxruntime and others.
Actually, I haven't used onnxruntime before either, which may be why I had these questions. I come from PyTorch/TensorFlow and have used OpenVINO before.
Naturally, there are more practical examples for onnxruntime. Onnxruntime also has extensions that include some useful quality-of-life functionality, like tokenizers. It's not fair to expect that from a new framework; it's just something that would be nice to have eventually.
Currently, I am still exploring Sentis’s capabilities and don’t have many concrete suggestions.
There is another concrete question I have, though: am I right to assume that with TextureConverter.RenderToTexture(outputTensor, rt) the data is never copied to the CPU but remains on the GPU? I want to access the neural model's output data directly in a shader program. If this is how it works, it would likely be a big advantage over other deep learning inference frameworks. As far as I understand, it's usually not that easy to get model output data directly into a shader program.
Kind regards
Philipp
Yes, of course. With RenderToTexture everything stays on the GPU; there is no download.
You can also pin a tensor to a shader, write from a shader to a tensor, and so forth…
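For example, something like this (a sketch against the Sentis 1.x API, following the "Use a compute buffer" sample; the ShaderOutput class, the postprocess compute shader, and the OutputBuffer binding name are placeholders you'd swap for your own):

```csharp
using Unity.Sentis;
using UnityEngine;

public class ShaderOutput : MonoBehaviour
{
    public RenderTexture rt;          // target texture for the model output
    public ComputeShader postprocess; // your shader that consumes the tensor

    public void UseOutput(IWorker worker)
    {
        var output = worker.PeekOutput() as TensorFloat;

        // Blit the output tensor into a RenderTexture entirely on the GPU;
        // no CPU readback happens here.
        TextureConverter.RenderToTexture(output, rt);

        // Alternatively, pin the tensor's GPU data and bind the backing
        // compute buffer directly to a shader.
        var gpuData = ComputeTensorData.Pin(output);
        postprocess.SetBuffer(0, "OutputBuffer", gpuData.buffer);
        postprocess.Dispatch(0, 64, 1, 1);
    }
}
```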
You can do the same with custom Burst jobs.
I'll refer you to:
~Samples/Use a compute buffer
~Samples/Use Burst to write data