Using the Octopus v2 model in Unity

I’m running into an error stating that my input does not meet the required size of (1, 39) while using the Octopus v2 model in Unity with the Sentis library. Could someone provide guidance on diagnosing and resolving this error?

What’s the input the model is expecting and what is the shape of the input you are feeding it?

I am using text as input, which is first converted to tokens using the vocab.json and merges.txt files. The tokens are then fed to the model. The error says the expected input size is (1, 39).

The screenshot speaks for itself, doesn’t it?
You have 39 inputs, all of which seem to have a shape of (d0, 1, d3, 256).
You are trying to run the model with an input of shape (1, 39).
The shapes clearly don’t match…
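
For what it’s worth, 39 inputs is the classic signature of a decoder exported with its KV cache. If Octopus v2 is Gemma-2B-based (18 decoder layers — that part is my assumption), an export with past key/values exposes input_ids, attention_mask, and position_ids plus one past key and one past value tensor per layer, and (d0, 1, d3, 256) looks like a (batch, kv_heads, seq_len, head_dim) cache shape:

```python
num_layers = 18  # Gemma-2B decoder layers (assumption)
base_inputs = ["input_ids", "attention_mask", "position_ids"]

# One past key and one past value tensor per decoder layer
past_kv_inputs = [
    f"past_key_values.{i}.{kind}"
    for i in range(num_layers)
    for kind in ("key", "value")
]

total_inputs = len(base_inputs) + len(past_kv_inputs)
print(total_inputs)  # 3 + 2 * 18 = 39
```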

I think that after tokenizing the text input, it should have this dimension. Here is the exact error message:

I am using this code. The model takes 39 inputs and I am giving it only one. I am trying to figure out how to provide 39 inputs of shape (d0, 1, d3, 256).

using System.Collections;
using System.Collections.Generic;
using UnityEngine;
using Unity.Sentis;
using System.IO;
using System.Text;
using FF = Unity.Sentis.Functional;
public class octo : MonoBehaviour
{
    const BackendType backend = BackendType.GPUCompute;

    string outputString = "One day an alien came down from Mars. It saw a chicken";

    const int maxTokens = 100;
    const float predictability = 5f;
    const int END_OF_TEXT = 50256;

    string[] tokens;
    IWorker engine;
    int currentToken = 0;
    int[] outputTokens = new int[maxTokens];
    int[] whiteSpaceCharacters = new int[256];
    int[] encodedCharacters = new int[256];
    bool runInference = false;
    const int stopAfter = 100;
    int totalTokens = 0;
    string[] merges;
    Dictionary<string, int> vocab;
    Model model1;

    void Start()
    {
        SetupWhiteSpaceShifts();
        LoadVocabulary();

        model1 = ModelLoader.Load(Path.Join(Application.streamingAssetsPath, "model.sentis"));
        var modelInput = model1.inputs[0];
        var outputIndex = model1.outputs.Count - 1;

        // Create a new model to select the random token:
        var model2 = FF.Compile(
            (input, currentToken) =>
            {
                var row = FF.Select(model1.Forward(input)[outputIndex], 1, currentToken);
                return FF.Multinomial(predictability * row, 1);
            },
            (modelInput, InputDef.Int(new TensorShape()))
        );

        engine = WorkerFactory.CreateWorker(backend, model2);

        DecodePrompt(outputString);

        runInference = true;
    }

    void Update()
    {
        if (runInference)
        {
            RunInference();
        }
    }
    void RunInference()
    {
        using var tokensSoFar = new TensorInt(new TensorShape(1, maxTokens), outputTokens);

        // Pass the required input to the engine
        engine.Execute(new Dictionary<string, Tensor> { { model1.inputs[0].name, tokensSoFar } });

        var probs = engine.PeekOutput() as TensorInt;
        probs.CompleteOperationsAndDownload();

        int ID = probs[0];

        //shift window down if got to the end
        if (currentToken >= maxTokens - 1)
        {
            for (int i = 0; i < maxTokens - 1; i++) outputTokens[i] = outputTokens[i + 1];
            currentToken--;
        }

        outputTokens[++currentToken] = ID;
        totalTokens++;

        if (ID == END_OF_TEXT || totalTokens >= stopAfter)
        {
            runInference = false;
        }
        else if (ID < 0 || ID >= tokens.Length)
        {
            outputString += " ";
        }
        else outputString += GetUnicodeText(tokens[ID]);

        Debug.Log(outputString);
    }

    void DecodePrompt(string text)
    {
        var inputTokens = GetTokens(text);

        for(int i = 0; i < inputTokens.Count; i++)
        {
            outputTokens[i] = inputTokens[i];
        }
        currentToken = inputTokens.Count - 1;
    }
   
    void LoadVocabulary()
    {
        var jsonText = File.ReadAllText(Path.Join(Application.streamingAssetsPath , "vocab.json"));
        vocab = Newtonsoft.Json.JsonConvert.DeserializeObject<Dictionary<string, int>>(jsonText);
        tokens = new string[vocab.Count];
        foreach (var item in vocab)
        {
            tokens[item.Value] = item.Key;
        }

        merges = File.ReadAllLines(Path.Join(Application.streamingAssetsPath , "merges.txt"));
    }

    // Translates encoded special characters to Unicode
    string GetUnicodeText(string text)
    {
        var bytes = Encoding.GetEncoding("ISO-8859-1").GetBytes(ShiftCharacterDown(text));
        return Encoding.UTF8.GetString(bytes);
    }
    string GetASCIIText(string newText)
    {
        var bytes = Encoding.UTF8.GetBytes(newText);
        return ShiftCharacterUp(Encoding.GetEncoding("ISO-8859-1").GetString(bytes));
    }

    string ShiftCharacterDown(string text)
    {
        string outText = "";
        foreach (char letter in text)
        {
            // Characters below 256 are plain bytes; 256 and above are shifted whitespace
            outText += ((int)letter < 256) ? letter :
                (char)whiteSpaceCharacters[(int)(letter - 256)];
        }
        return outText;
    }

    string ShiftCharacterUp(string text)
    {
        string outText = "";
        foreach (char letter in text)
        {
            outText += (char)encodedCharacters[(int)letter];
        }
        return outText;
    }

    void SetupWhiteSpaceShifts()
    {
        for (int i = 0, n = 0; i < 256; i++)
        {
            encodedCharacters[i] = i;
            if (IsWhiteSpace(i))
            {
                encodedCharacters[i] = n + 256;
                whiteSpaceCharacters[n++] = i;
            }
        }
    }

    bool IsWhiteSpace(int i)
    {
        //returns true if it is a whitespace character
        return i <= 32 || (i >= 127 && i <= 160) || i == 173;
    }

    List<int> GetTokens(string text)
    {
        text = GetASCIIText(text);

        // Start with a list of single characters
        var inputTokens = new List<string>();
        foreach(var letter in text)
        {
            inputTokens.Add(letter.ToString());
        }

        ApplyMerges(inputTokens);

        //Find the ids of the words in the vocab
        var ids = new List<int>();
        foreach(var token in inputTokens)
        {
            if (vocab.TryGetValue(token, out int id))
            {
                ids.Add(id);
            }
        }

        return ids;
    }

    void ApplyMerges(List<string> inputTokens)
    {
        foreach(var merge in merges)
        {
            string[] pair = merge.Split(' ');
            int n = 0;
            while (n >= 0)
            {
                n = inputTokens.IndexOf(pair[0], n);
                if (n != -1 && n < inputTokens.Count - 1 && inputTokens[n + 1] == pair[1])
                {
                    inputTokens[n] += inputTokens[n + 1];
                    inputTokens.RemoveAt(n + 1);
                }
                if (n != -1) n++;
            }
        }
    }

    private void OnDestroy()
    {
        engine?.Dispose();
    }
    
}
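
For context, the ApplyMerges loop is a greedy BPE merge pass. Roughly, in Python (a sketch of the idea, not the exact Hugging Face tokenizer):

```python
def apply_merges(tokens, merges):
    """Greedily merge adjacent token pairs, in merge-priority order."""
    for first, second in merges:
        i = 0
        while i < len(tokens) - 1:
            if tokens[i] == first and tokens[i + 1] == second:
                tokens[i] = first + second   # fuse the pair in place
                del tokens[i + 1]
            else:
                i += 1
    return tokens

# e.g. with merges [("l", "o"), ("lo", "w")]:
print(apply_merges(list("low"), [("l", "o"), ("lo", "w")]))  # ['low']
```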

How did you export the Octopus model?
Are you sure it’s correct? It seems weird that a model would have 39 inputs…

In a Python notebook, the exported ONNX model works fine. The input size changes according to the length of the input text, for example:


In Python it doesn’t show that the number of inputs is 39.

I’m referring to the screenshot of your model in Unity.


Something must be off with how you exported the ONNX file.

I converted the model using Optimum: Export a model to ONNX with optimum.exporters.onnx (huggingface.co)
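
If it helps, the task chosen at export time controls whether the cache tensors become graph inputs (this assumes the current optimum-cli flags and the NexaAIDev/Octopus-v2 model id):

```shell
# Export WITHOUT KV-cache inputs: the graph takes just input_ids / attention_mask
optimum-cli export onnx --model NexaAIDev/Octopus-v2 --task text-generation octopus_onnx/

# The default for causal LMs is text-generation-with-past, which adds the
# past_key_values.* inputs showing up in Sentis
```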

I have run the model with ORTModelForCausalLM and also inspected it in Netron to check the model structure. Everything seems alright. However, I guess there must be something off with the C# file; I probably have to tweak it to be compatible with Octopus, right?
Maybe I need to change the tokenization process in Unity so that it generates model-compatible input.

How was the tiny stories model been exported to ONNX? - AI Beta / Sentis - Unity Discussions
This issue is very relevant. I should look into how to export the model with one input instead of 39.

@alexandreribard_unity, what do you think about exporting the model with one input? Are there any specific settings required when exporting an LLM to ONNX format?

https://pytorch.org/docs/stable/onnx_torchscript.html
Follow this example; it should be straightforward.
You need a dummy input and a handle to the model (the nn.Module).