Currently, I am trying to get the MiDaS Sentis model working on the newer version of Sentis, as the source code linked through HuggingFace does not work anymore.
I managed to convert the model to *.sentis and I am receiving output from the model. However, the output is not normalised: the values seem to range from 0-(?). I would like to normalise them by adding layers to the output of the MiDaS model. I tried to copy/convert what was done in the example for Sentis 1.3, but I am not receiving any meaningful output (NaNs).
Here is my code:
    // FF is assumed to be an alias along the lines of: using FF = Unity.Sentis.Functional;
    runtimeModel = ModelLoader.Load(Application.streamingAssetsPath + "/midas.sentis");
    var additonalModel = Functional.Compile(
        input => {
            // Input (1, 256, 256)
            var modelOutput = runtimeModel.Forward(input)[0];
            // Reduce along dimension 0 without keeping the reduced dimension
            var maxVal = FF.ReduceMax(modelOutput, 0, false);
            var minVal = FF.ReduceMin(modelOutput, 0, false);
            // Subtract minVal from modelOutput (output - minVal)
            var normalizedOutput = FF.Sub(modelOutput, minVal);
            // Normalize by dividing by (maxVal - minVal)
            var range = FF.Sub(maxVal, minVal);
            normalizedOutput = FF.Div(normalizedOutput, range);
            // Return the normalized output
            return (normalizedOutput);
        },
        InputDef.FromModel(runtimeModel)[0]
    );
    worker = WorkerFactory.CreateWorker(backend, additonalModel);
Ah, I didn't know you could use regular operations inside of Functional.Compile.
However, that doesn't fix my issue: I still receive a NaN for every value in the (256, 256) array. I even thought the numbers might just be so small that they show up as NaNs, but I checked the min and max values in the tensor and both are NaN.
I think the issue is with the operations I am doing in the additional layers. I have a feeling that in the last operation, FF.Div(normalizedOutput, range) * 255f, range is equal to 0, and dividing by 0 produces the NaN.
Also, the raw output from the MiDaS model seems to range from roughly 0 to 2000, though I could be wrong.
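To make the divide-by-zero suspicion concrete, here is a minimal sketch in plain Python (not Sentis code). The helper names ieee_div and min_max_normalise are made up for illustration, and ieee_div only emulates IEEE 754 float division, because Python itself raises an error on division by zero. The sketch assumes the tensor backend follows IEEE rules, where 0/0 gives NaN.

```python
import math

def ieee_div(a: float, b: float) -> float:
    """Emulate IEEE 754 division (simplified: ignores the sign of a zero divisor)."""
    if b != 0.0:
        return a / b
    if a == 0.0 or math.isnan(a):
        return math.nan                    # 0 / 0 is NaN
    return math.copysign(math.inf, a)      # x / 0 is +/- infinity

def min_max_normalise(values, eps=0.0):
    """Map values to [0, 1]; eps is an optional guard against a zero range."""
    lo, hi = min(values), max(values)
    return [ieee_div(v - lo, (hi - lo) + eps) for v in values]

flat = [5.0, 5.0, 5.0]           # min == max, so the range is 0
spread = [0.0, 500.0, 2000.0]    # roughly the raw range reported above

nan_result = min_max_normalise(flat)          # every entry is NaN
guarded = min_max_normalise(flat, eps=1e-6)   # every entry is 0.0
scaled = min_max_normalise(spread)            # [0.0, 0.25, 1.0]
```

Adding a small epsilon to the denominator hides the NaN, but it does not explain why the range would be zero in the first place, so it is a guard rather than a fix.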
Is ReduceMax supposed to return the maximum value (a float in my case) across all dimensions?
It seems that it is not working the way it is supposed to, or am I misunderstanding it? If I return the minVal functional tensor as the output, I receive a tensor of shape (256, 256), which shouldn't be the case. Should it not just return a single float?
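A (256, 256) result is what a reduction over a single axis would give. As a rough illustration in plain Python (not Sentis code), assuming the model output has shape (1, 256, 256) as the code comment says and that ReduceMax/ReduceMin with dimension 0 behave like a standard axis reduction: reducing over an axis of size 1 returns the input unchanged, so min equals max at every pixel and the range is 0 everywhere. The helpers reduce_axis0 and reduce_all are hypothetical names, and a tiny (1, 2, 2) tensor stands in for the real one.

```python
def reduce_axis0(t, fn):
    """Reduce a (N, H, W) nested list over axis 0 only; the result is (H, W)."""
    h, w = len(t[0]), len(t[0][0])
    return [[fn(t[n][i][j] for n in range(len(t))) for j in range(w)]
            for i in range(h)]

def reduce_all(t, fn):
    """Reduce over every axis; the result is a single float."""
    return fn(v for plane in t for row in plane for v in row)

depth = [[[0.0, 10.0], [250.0, 2000.0]]]   # shape (1, 2, 2)

# Axis-0 reduction: with N == 1 both results are just the input plane.
min0 = reduce_axis0(depth, min)
max0 = reduce_axis0(depth, max)
range0 = [[hi - lo for lo, hi in zip(row_lo, row_hi)]
          for row_lo, row_hi in zip(min0, max0)]   # all zeros

# Global reduction: one min and one max for the whole tensor.
gmin, gmax = reduce_all(depth, min), reduce_all(depth, max)
normalised = [[(v - gmin) / (gmax - gmin) for v in row] for row in depth[0]]
```

If that assumption holds, the NaNs come from 0/0 at every pixel, and the normalisation would need the min and max taken over all axes (or over the two spatial axes) rather than over dimension 0 alone.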
I tried applying a sigmoid function to the model output there, but I seem to be getting better results with the prior operations I had. I'm only learning this framework at the moment, and I don't have much experience working with ML frameworks, so sorry if I'm asking silly questions…
Would you have any idea why the output is worse using sigmoid?
Sigmoid is a non-linear remapping. If you look at the function curve, it squashes the values in the tails together.
Min-max remapping is linear: it preserves the relative differences between values.
So it's not worse. It just so happens that the depth values span a large range and therefore end up with very similar values after the sigmoid, which makes it hard for our eyes to notice the small differences.
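A small plain-Python sketch of that difference, using made-up sample depths in the roughly 0 to 2000 range mentioned earlier in the thread (the values themselves are an assumption, not real model output):

```python
import math

def sigmoid(x: float) -> float:
    """Logistic function; saturates towards 1 for large positive x."""
    return 1.0 / (1.0 + math.exp(-x))

def min_max(values):
    """Linear remap to [0, 1]; spacing between values is preserved."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

raw = [0.0, 10.0, 500.0, 1000.0, 2000.0]

via_sigmoid = [sigmoid(v) for v in raw]
via_min_max = min_max(raw)

# Sigmoid: everything from 10 upwards lands within 0.0001 of 1.0,
# so 500, 1000 and 2000 become indistinguishable.
# Min-max: the same inputs stay spread out across [0, 1].
```

With inputs this large, almost every pixel sits in the flat tail of the sigmoid, which is why the resulting image looks nearly uniform.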