MiDaS Depth Estimation

I’ve converted most of the available MiDaS models to ONNX and uploaded them to GitHub as part of a package for using them with Sentis, in case it’s useful to anyone.

Asset Store · GitHub · Documentation · OpenUPM · Hugging Face
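
For anyone curious what using the package’s models looks like in code, here is a minimal sketch based on the Sentis 1.x API (which may change between versions); `modelAsset` and `sourceTexture` are placeholder fields you would assign in the Inspector:

```csharp
using Unity.Sentis;
using UnityEngine;

// Minimal sketch (Sentis 1.x API, subject to change): load an imported model
// and run a single inference on a texture.
public class MiDaSExample : MonoBehaviour
{
    public ModelAsset modelAsset;     // an imported .onnx / .sentis asset
    public Texture2D sourceTexture;   // the image to estimate depth for
    IWorker worker;

    void Start()
    {
        var model = ModelLoader.Load(modelAsset);
        worker = WorkerFactory.CreateWorker(BackendType.GPUCompute, model);

        using var input = TextureConverter.ToTensor(sourceTexture);
        worker.Execute(input);
        var depth = worker.PeekOutput() as TensorFloat; // relative depth map
    }

    void OnDestroy() => worker?.Dispose();
}
```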


Awesome! The video is amazing. I recognise the Blender bunny!

BTW, if you would like more eyes :eye::eye: on this, we are encouraging people to upload their Sentis models to Hugging Face using the library tag Unity Sentis. You can copy the format of our sentis-othello example, where we linked to an external GitHub repo. We would prefer that people include the serialized versions of the .sentis files (you can add the ONNX files too). As you can see, we have one version of MiDaS, but the more the merrier!


Thank you, I will look into uploading to Hugging Face (although tbh I’m still a bit confused about its purpose and what it does that GitHub doesn’t already do).


Speaking of .sentis as a distribution format: Is it really a good idea to distribute those?

My thinking was that if I serialize some models using a specific version of Sentis, anyone importing that file into a project with an older Sentis version might have trouble if the serialization format has changed.

I’m all on board with a runtime format that aims for zero-copy deserialization rather than protobuf deserialization. But the standardization and versioning of .onnx make it better suited as the source format, imho.

So I think it would be ideal if the AssetDatabase handled this under the hood, where people use .onnx as the source files and Unity caches the imported .sentis model in the Library folder, but as a weak reference, so it doesn’t constantly keep the model in memory unless used.

This is similar to how people share .gltf, .fbx or .obj files, not serialized Unity .asset files.

(The whole StreamingAssets workflow is also a bit clunky and non-standard imho)


You make some valid points.

You can think of Hugging Face as a place where people go to search for models. So although functionally it may not be that different from GitHub, it’s more a case of making things easy to find. (For example, people can click on the Unity Sentis tag and see all the models that are validated for Unity.)

For your other point, the idea is that people will always want to run the latest version of the Sentis package, and that if the Sentis file format changes there will be a way to auto-convert files to the latest version.

For the Library caching, there are some drawbacks to this approach for larger files, for example if you are dealing with a 10 GB model. So in that sense we treat large model files more like a large movie file, which you would put in StreamingAssets.
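
As a sketch of that StreamingAssets workflow (assuming Sentis’s `ModelLoader` accepts a file path, as in the 1.x API; “midas.sentis” is a hypothetical file name):

```csharp
using System.IO;
using Unity.Sentis;
using UnityEngine;

// Sketch: load a serialized .sentis model shipped in StreamingAssets, so the
// weights live on disk rather than inside a serialized Unity asset.
// Note: on platforms where StreamingAssets is packed into an archive
// (e.g. Android), direct file access needs extra handling.
public class StreamingModelLoader : MonoBehaviour
{
    IWorker worker;

    void Start()
    {
        string path = Path.Combine(Application.streamingAssetsPath, "midas.sentis");
        var model = ModelLoader.Load(path);
        worker = WorkerFactory.CreateWorker(BackendType.GPUCompute, model);
        // ... run inference as usual
    }

    void OnDestroy() => worker?.Dispose();
}
```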

But you are right, there are pros and cons to both methods. Well, hope that explains our ideas behind it. :slightly_smiling_face:


@julienkay the thing is, an ONNX import is complicated and involves a few optimization passes.
We definitely don’t want to pay the cost of doing that every time you load a model.
We also cannot ship the protobuf DLL on every platform.
This is the same as .obj: every time you import one into your project, it re-imports the mesh.
The AssetDatabase does this and keeps a serialized .sentis under the hood, of course; it’s just that

  • you have to import the file
  • you keep both the .onnx and the cached .sentis on your PC
    Of course, when you build a game we don’t ship both files :slight_smile:

.sentis files do have a version number, so we can check whether that version matches the current Sentis package version when importing.
But yes, generally I do tend to agree with your points.


Yes, I agree with those points. To be clear, I don’t question the need for .sentis as a runtime format at all.

I guess my complaint has more to do with how the AssetDatabase works in general for these types of use cases. As far as I can tell, the “StreamingAssets workaround” could be something that’s done under the hood for large files, without adding friction to the user’s workflow.

But I get that this is owned by a different team and is probably not an easy thing to change for you in the short term.

Hi, what would be the approach to using the new Depth-Anything model? I downloaded the ONNX model and copied it into the default Barracuda project, but it doesn’t let me choose it; it shows import errors about serialization. Thank you.

Use com.unity.sentis 1.3.0-pre.3. If it doesn’t work, please share the ONNX so we can fix the import.

I see the sample uses OnRenderImage.
Why is there no URP implementation? I thought Unity wanted BiRP out.

We’ll add a URP implementation, thanks for the request

Awesome!
Any ETA for that?

Hey, I’ve updated the project to be URP compatible.


Hey @alexandreribard_unity. Just trying to get the latest Depth-Anything ONNX models working, but I seem to be getting import errors. I am on com.unity.sentis 1.3.0-pre.3.

depth_anything_v2_dinov2_layers_patch_embed_PatchEmbed_pretrained_patch_embed_1 not supported
aten_expand not supported
aten_add not supported
depth_anything_v2_dinov2_layers_block_Block_pretrained_blocks_0_1 not supported
[...]

I do not think this package has the one-layer-per-frame optimization.
Would love to see it for it to be more mobile friendly


I agree this would be a useful addition.

So far I’ve held off on adding a method that distributes inference over multiple frames, because it is not easy to figure out the exact number of layers to execute per frame. It depends on the target device and the types of layers. And since the package supports models of various sizes, that also plays into how many you would run each frame. One layer per frame, for example, would needlessly prolong the whole process.

The thing I could add would be a method that lets you specify that number yourself. Something like:

EstimateDepthAsync(Texture t, int numLayersPerFrame)
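
A coroutine version of that hypothetical method could look roughly like this, assuming Sentis 1.x’s `IWorker.StartManualSchedule`, which schedules one layer per `MoveNext()` call (the method and parameter names are placeholders from the signature above):

```csharp
using System.Collections;
using Unity.Sentis;
using UnityEngine;

// Sketch: spread inference over multiple frames by executing
// `numLayersPerFrame` layers, then yielding until the next frame.
public class TimeSlicedDepth : MonoBehaviour
{
    public IEnumerator EstimateDepthAsync(
        IWorker worker, Tensor input, int numLayersPerFrame,
        System.Action<Tensor> onDone)
    {
        IEnumerator schedule = worker.StartManualSchedule(input);
        int layersRun = 0;
        while (schedule.MoveNext())          // executes one layer per step
        {
            if (++layersRun % numLayersPerFrame == 0)
                yield return null;           // defer the rest to the next frame
        }
        onDone?.Invoke(worker.PeekOutput()); // notify caller with the result
    }
}
```

The `onDone` callback doubles as the event/delegate notification mentioned below.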

Also, out of curiosity would you personally prefer to call this method

  • as a coroutine
  • via async/await
  • or get notified through an event / delegate when it’s done

Lastly, I’d also recommend voting for the ‘Automatic time slicing’ idea on the Sentis Roadmap, the goal of which is to provide a way to specify a target framerate, which would be ideal.

Personally, I have tried implementing it in a coroutine and in Update (the coroutine was better).
For an asset, I think firing an event is a great way to do this, so people just have to add a listener.
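
For reference, an event-based surface like the one described might look like this (the names are hypothetical; a serialized `UnityEvent` is just one option alongside plain C# events, with the advantage that listeners can also be wired up in the Inspector):

```csharp
using UnityEngine;
using UnityEngine.Events;

// Sketch: listeners subscribe once and are notified whenever an
// asynchronous depth estimation finishes.
public class DepthEstimationEvents : MonoBehaviour
{
    // Serialized, so listeners can be added in the Inspector without code.
    public UnityEvent<RenderTexture> onDepthEstimated = new UnityEvent<RenderTexture>();

    // Called by the inference coroutine when the result is ready.
    void NotifyDone(RenderTexture depth) => onDepthEstimated.Invoke(depth);
}

// Usage from another script:
// GetComponent<DepthEstimationEvents>().onDepthEstimated.AddListener(depth => { /* ... */ });
```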