Segmentation Results seem to be shifted when updating from 1.3.0-pre.2 to 1.3.0-pre.3

I have implemented semantic segmentation with DeepLab v3 using Unity Sentis.
It seems to work fine with Sentis 1.3.0-pre.2.
However, after updating to Sentis 1.3.0-pre.3, the segmentation mask appears to be shifted.
Specifically, it looks as if it is shifted toward the upper left. Is this a bug?

I used a pre-trained model from PyTorch (torchvision.models).
https://pytorch.org/vision/stable/models/generated/torchvision.models.segmentation.deeplabv3_mobilenet_v3_large.html#torchvision.models.segmentation.deeplabv3_mobilenet_v3_large

ONNX is here.

Thanks, we’ll look into it.

This is known internally as Issue 416.

I would love to see the code you built to apply the image segmentation masks to these images. Could you share your scripts?

@RossMelbourne This snippet is the core of the inference code for the segmentation model.

/* pre process (convert to tensor from input image) */

// predict
worker.Execute(input_tensor);
var output_tensor = worker.PeekOutput("output") as TensorFloat;

// get mask tensor
var mask_tensor = ops.ArgMax(output_tensor, 1, false);
mask_tensor.MakeReadable();

/* post process (resize to input image size) */

/* visualize mask (colorize by mask indices) */
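
The two elided steps above (resize and colorize) could look roughly like the sketch below. This is not the code from my package, only a minimal illustration in plain C#. It assumes the class indices have already been copied out of mask_tensor into a row-major int[]. The names MaskPostProcess, ResizeNearest, Colorize and the palette are hypothetical.

```csharp
using System;

public static class MaskPostProcess
{
    // Nearest-neighbour resize of a row-major class-index mask.
    // Sampling at pixel centres ((x + 0.5) * scale) keeps the result
    // from being biased toward the top-left corner.
    public static int[] ResizeNearest(int[] mask, int srcW, int srcH, int dstW, int dstH)
    {
        if (mask.Length != srcW * srcH)
            throw new ArgumentException("mask length must equal srcW * srcH");

        var result = new int[dstW * dstH];
        for (int y = 0; y < dstH; y++)
        {
            int sy = Math.Min(srcH - 1, (int)((y + 0.5) * srcH / dstH));
            for (int x = 0; x < dstW; x++)
            {
                int sx = Math.Min(srcW - 1, (int)((x + 0.5) * srcW / dstW));
                result[y * dstW + x] = mask[sy * srcW + sx];
            }
        }
        return result;
    }

    // Map each (non-negative) class index to an RGBA colour.
    // The palette is hypothetical: one 4-byte entry per class, reused cyclically.
    public static byte[] Colorize(int[] mask, byte[][] palette)
    {
        var rgba = new byte[mask.Length * 4];
        for (int i = 0; i < mask.Length; i++)
        {
            byte[] color = palette[mask[i] % palette.Length];
            Array.Copy(color, 0, rgba, i * 4, 4);
        }
        return rgba;
    }
}
```

In Unity, the resulting RGBA byte array could then be uploaded to a Texture2D of the same size and drawn over the input image.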

Thank you for this. My Unity skills are not that great. The post-processing and applying the mask visually are where I need help.

@RossMelbourne We have released image recognition packages based on Unity Sentis. They include a sample app for this semantic segmentation. Please refer to these packages; I think they will be useful for you. Thanks,

@alexandreribard_unity @liutaurasvysniauskas_unity
I tried porting to Unity Sentis 1.4.0-pre.3 and running this program, but the issue still does not seem to be resolved. I hope you will continue to work on it. Thanks,

Yes, we’re still working on it. Sorry for not getting it into this release, and thanks for your patience :)

I checked the model inference against the ONNX runtime, and we seem to match their result very closely. I expect the issue is in your tensor pre-processing, post-processing or rendering code rather than in the model inference.

Can you help us validate this and also show your surrounding code?

@gilescoope Yes, my code is published as part of these image recognition packages based on Unity Sentis. The jp.co.hololab.dnn.segmentation package contains a sample app with visualization for semantic segmentation.
https://discussions.unity.com/t/introduction-of-image-recognition-packages-project-that-based-on-sentis/336955

OK, there seem to be a lot of preprocessing and postprocessing steps involved.

The best thing for you to do would be to check the inputs and outputs of the model with the different versions of Sentis to verify whether the change between Sentis 1.3.0-pre.2 and Sentis 1.3.0-pre.3 (or later) is in the model inference itself or in the preprocessing and postprocessing code.
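
One way to do that check is to run the same image through both versions, dump the input tensor, the raw model output and the argmax mask as plain arrays, and compare them stage by stage. Below is a minimal sketch in plain C#. It assumes you have already copied the tensor data into float[] / int[] buffers, and the names TensorDiff, MaxAbsDiff and MismatchRatio are hypothetical helpers, not part of Sentis.

```csharp
using System;

public static class TensorDiff
{
    // Largest absolute element-wise difference between two equally sized buffers.
    public static float MaxAbsDiff(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("buffers must have the same length");

        float max = 0f;
        for (int i = 0; i < a.Length; i++)
            max = Math.Max(max, Math.Abs(a[i] - b[i]));
        return max;
    }

    // Fraction of pixels whose class index differs between two masks.
    public static double MismatchRatio(int[] a, int[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("masks must have the same length");
        if (a.Length == 0)
            return 0.0;

        int mismatches = 0;
        for (int i = 0; i < a.Length; i++)
            if (a[i] != b[i]) mismatches++;
        return (double)mismatches / a.Length;
    }
}
```

If the input tensors already differ between versions, the change is in the preprocessing. If the inputs match but the raw outputs differ, it points at the inference. If both match and only the final mask differs, look at the postprocessing or rendering.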