I have managed to convert most of the code for the Whisper example (as downloaded from HuggingFace), but I am having no success with the code below, which is in RunWhisper.cs at about line 80.
I have tried a few things to get this working, but with no luck: the Compile function has changed, there is no “Forward” on the model anymore, and it doesn’t seem to like the audio input. I am learning as I go, so I am no expert!
Does anyone have any pointers on how to update this?
var graph = new FunctionalGraph();
var tokens = graph.AddInput(model, 0);
var tokens_audio = Functional.Forward(model, tokens);
var output = Functional.ArgMax(tokens_audio[0], 2);
var decoderWithArgmax = graph.Compile(output);
Would someone be able to help me with a similar issue, converting sentis-MiniLM-v6 to Sentis 2.0? I’m in the same boat: I have been able to convert the rest of the code, but I am currently stuck on the two Functional.Compile() calls in the sample script:
var graph = new FunctionalGraph();
var input_ids = graph.AddInput(model, 0);
var attention_mask = graph.AddInput(model, 1);
var token_type_ids = graph.AddInput(model, 2);
var tokenEmbeddings = Functional.Forward(model, input_ids, attention_mask, token_type_ids)[0];
var meanPooling = MeanPooling(tokenEmbeddings, attention_mask);
Model modelWithMeanPooling = graph.Compile(meanPooling);
and
var graph = new FunctionalGraph();
var input1 = graph.AddInput(DataType.Float, new TensorShape(1, FEATURES));
var input2 = graph.AddInput(DataType.Float, new TensorShape(1, FEATURES));
var output = Functional.ReduceSum(input1 * input2, 1);
Model dotScoreModel = graph.Compile(output);
Thanks for the feedback. This is a bug; I will create a ticket and we will try to get a fix into the next update.
The only workaround I can think of is to make sure the inputs to the model have static shapes.
So do something like this, filling in the data type and shape of each input:
var input_ids = graph.AddInput(___, new TensorShape(_____));
var attention_mask = graph.AddInput(___, new TensorShape(_____));
var token_type_ids = graph.AddInput(___, new TensorShape(_____));
You can inspect the model in Unity to see the data types and input shapes, and if they have dynamic shapes (they will look like “d0, d1…”) you can write those out with static values that you will feed in with your input tensors.
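If it is easier to check this from a script than from the Inspector, a rough sketch along these lines should log each input's name, data type and shape. The class name and the `modelAsset` field are placeholders made up for illustration, and it assumes the Sentis 2.0 `Unity.Sentis` namespace:

```csharp
using Unity.Sentis;
using UnityEngine;

// Hypothetical helper component: the class and field names are illustrative only.
public class LogModelInputs : MonoBehaviour
{
    // Assign the imported model asset in the Inspector.
    public ModelAsset modelAsset;

    void Start()
    {
        Model model = ModelLoader.Load(modelAsset);

        // Each entry describes one model input; dynamic dimensions
        // show up as symbolic names rather than numbers.
        foreach (var input in model.inputs)
        {
            Debug.Log($"{input.name}: {input.dataType} {input.shape}");
        }
    }
}
```

Any dimension that is not printed as a plain number is one you would need to replace with a fixed value.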
I can’t 100% guarantee this will fix the issue you have described, as it will depend on your model. But in general if your input tensors have fixed sizes then it’s good practice to do this, as we can optimize the model better in this case.
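To make the suggestion concrete, here is a minimal sketch for a three-input model like the MiniLM one above. It assumes all three inputs are integer tensors of shape (1, 128); both the `DataType.Int` choice and the sequence length of 128 are assumptions you would need to check against your own model, and the class and field names are placeholders:

```csharp
using Unity.Sentis;
using UnityEngine;

// Hypothetical example component: names and shape values are illustrative only.
public class StaticShapeGraph : MonoBehaviour
{
    public ModelAsset modelAsset;

    // Assumed fixed sequence length; pad or truncate your tokens to match it.
    const int SequenceLength = 128;

    void Start()
    {
        Model model = ModelLoader.Load(modelAsset);

        var graph = new FunctionalGraph();
        var shape = new TensorShape(1, SequenceLength);

        // Declare each input with an explicit data type and static shape
        // instead of taking the (possibly dynamic) shape from the model.
        var input_ids = graph.AddInput(DataType.Int, shape);
        var attention_mask = graph.AddInput(DataType.Int, shape);
        var token_type_ids = graph.AddInput(DataType.Int, shape);

        var tokenEmbeddings = Functional.Forward(model, input_ids, attention_mask, token_type_ids)[0];

        // Compile as before; any extra ops (e.g. pooling) would go before this call.
        Model compiled = graph.Compile(tokenEmbeddings);
    }
}
```

With this approach, every tensor you feed in at runtime has to have exactly the shape declared here.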
Let me know if this works. As I said, we will try to fix the underlying issue too.