On performance

I have an image enhancement model intended for real-time video enhancement, so the performance requirements are very demanding.
The model is trained in PyTorch, exported to ONNX, and then converted to TensorRT for the non-Unity version of our app. Inference takes 10 ms on an RTX 3090, which barely meets our needs. I then tried the same ONNX model in Unity Sentis, where a single inference takes around 80 ms.
Is this performance gap between TensorRT and Sentis normal? Is it possible to optimize the Sentis model to the same level as TensorRT?
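
For a like-for-like comparison, both numbers should be measured the same way. Below is a minimal Python sketch of a timing harness, not the code used for the figures above. The names `measure_latency_ms`, `fps`, and `slowdown` are hypothetical helpers, and `run_once` stands in for whatever call runs one inference. It assumes `run_once` blocks until the result is ready; GPU work is usually queued asynchronously, so without a synchronization point the timer can stop too early.

```python
import statistics
import time


def measure_latency_ms(run_once, warmup=10, iters=50):
    """Return the median wall-clock latency of run_once() in milliseconds."""
    # Warm-up runs absorb one-time costs (lazy initialisation, caches, kernel selection).
    for _ in range(warmup):
        run_once()
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        run_once()  # assumed to block until the output is available
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def fps(latency_ms):
    """Frame rate sustainable if every frame costs latency_ms of inference."""
    return 1000.0 / latency_ms


def slowdown(baseline_ms, candidate_ms):
    """How many times slower the candidate is than the baseline."""
    return candidate_ms / baseline_ms


if __name__ == "__main__":
    # Latencies quoted in this post: 10 ms (TensorRT) and about 80 ms (Sentis).
    print(fps(10.0), fps(80.0), slowdown(10.0, 80.0))
```

With the latencies quoted here, that works out to 100 fps for TensorRT against 12.5 fps for Sentis, an 8x gap.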

It seems the Conv op takes most of the time. Is this expected? What can I do to improve it?

Ouch, that seems pretty bad…
That figure is the cumulative time of all the conv layers in the model.
But even so, roughly 8x slower than TensorRT (80 ms vs. 10 ms) is bad, so please share the model and we’ll look into optimizing it.
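
To illustrate what "cumulative" means here: a profiler typically reports each layer separately, and the per-op figure is the sum over every layer of that type. A minimal Python sketch follows; `total_ms_by_op` is a hypothetical helper, and the layer names and timings are made-up placeholders, not measurements from the model in this thread.

```python
from collections import defaultdict


def total_ms_by_op(layer_timings):
    """Sum per-layer times by op type; return (op_type, total_ms) pairs, largest first."""
    totals = defaultdict(float)
    for _name, op_type, ms in layer_timings:
        totals[op_type] += ms
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


# Placeholder values for illustration only, NOT profiler output from this model.
EXAMPLE_TIMINGS = [
    ("conv_0", "Conv", 2.0),
    ("relu_0", "Relu", 0.5),
    ("conv_1", "Conv", 3.0),
    ("add_0", "Add", 0.25),
    ("conv_2", "Conv", 1.5),
]

if __name__ == "__main__":
    for op_type, ms in total_ms_by_op(EXAMPLE_TIMINGS):
        print(f"{op_type}: {ms:.2f} ms")
```

Grouping this way shows whether a few expensive layers or many cheap ones dominate, which helps decide where to start optimizing.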

Thank you for your response. The links below are to the ONNX model and the TensorRT model.