Does BERT work with Barracuda?

Hello!

※ Problem
I tried to get a BERT (Bidirectional Encoder Representations from Transformers) model running on Barracuda.
However, when importing the ONNX model, an error is reported.

※ Question

  1. Can I run BERT on Barracuda?
  2. Can I use Transformer models with Barracuda?

※ Environment
Unity 2020.3.29f1
Barracuda 3.0.0

Thanks.


I also need this information…

Currently it does not work, no. You might have some luck if the input shape is fixed.
Simpler Transformer modules like torch's MultiheadAttention work with fixed input shapes, but larger models like BERT do not work with this Barracuda version.

For a university thesis, I was trying to integrate BERT into Unity, or more generally into a social agent.
Do you have any suggestions?
I have tried several times to load the .onnx model, but I always ran into various problems… now I am at a dead end T_T

Hi there,
You could use ONNX Runtime via the ML.NET NuGet package. Be careful to export your BERT model with a high enough opset version. You will also need an adapted tokenizer that works with subwords; no official package is available, but a quick online search will yield solutions in that regard.
Good luck !
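To illustrate the subword tokenization mentioned above: the core of a BERT-style WordPiece tokenizer is a greedy longest-match-first split of each word against the vocabulary. Here is a minimal Python sketch of that idea (the toy vocabulary is a hypothetical stand-in; a real BERT vocab has roughly 30,000 entries):

```python
# Greedy longest-match-first subword tokenization for a single word,
# in the style of BERT's WordPiece ("##" marks word-internal pieces).
def wordpiece_tokenize(word, vocab, unk="[UNK]"):
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation piece
            if candidate in vocab:
                piece = candidate
                break
            end -= 1  # shrink the candidate until it is in the vocab
        if piece is None:
            return [unk]  # no piece matched: treat the whole word as unknown
        pieces.append(piece)
        start = end
    return pieces

# Toy vocabulary for demonstration only.
toy_vocab = {"play", "##ing", "##ed", "un", "##believ", "##able"}
print(wordpiece_tokenize("playing", toy_vocab))       # → ['play', '##ing']
print(wordpiece_tokenize("unbelievable", toy_vocab))  # → ['un', '##believ', '##able']
```

A real tokenizer additionally handles lower-casing, punctuation splitting, and the special tokens ([CLS], [SEP], [PAD]), but the greedy matching above is the part most ad-hoc implementations get wrong.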


Yes, ONNX Runtime is the best bet here.

Do you happen to have a particular guide that could help me?
If I could just get the model loaded into Unity, that would already be an achievement.

I saw this video:

https://www.youtube.com/watch?v=Yf7VXj1etdw

Should I do as he did?

Thanks in advance

Hey,
What is done in the video looks similar to what I did. My workflow is: use the Python transformers library (by Hugging Face) to load a BERT model and fine-tune it on a downstream task, export it to ONNX, then load the .onnx model with ONNX Runtime and ML.NET, and use it for inference.
See Using Huggingface Transformers with ML.NET | Rubik’s Code (rubikscode.net) for example.
HuggingFace has some documentation too (see : Export Transformers Models (huggingface.co)).
I did not find an exhaustive tutorial that suited my needs, but I believe the first link may help you. By the way, there are other ways of loading a .onnx model into Unity than the one presented in the first link.

Hope it helps


No luck so far, I’m going crazy.

Check this document. I used it as a reference and successfully implemented inference for a BERT model.

Here is my code:

using BERTTokenizers;
using Microsoft.ML.Data;
using Microsoft.ML.OnnxRuntime;
using Microsoft.ML.OnnxRuntime.Tensors;
using System;
using System.Collections.Generic; // needed for Dictionary<,> and List<>
using System.Linq;                // needed for Select/ToArray

namespace MyApp
{
    internal class Program
    {
        static void Main(string[] args)
        {
            var sentence = "I'm sorry for what i said";
            //var sentence = "{\"question\": \"Where is Bob Dylan From?\", \"context\": \"Bob Dylan is from Duluth, Minnesota and is an American singer-songwriter\"}";
            Console.WriteLine(sentence);

            var tokenizer = new BertUncasedBaseTokenizer();
            var tokens = tokenizer.Tokenize(sentence);
            //Console.WriteLine(String.Join(", ", tokens));
            var encoded = tokenizer.Encode(tokens.Count, sentence);

            var bertInput = new BertInput()
            {
                InputIds = encoded.Select(t => t.InputIds).ToArray(),
                AttentionMask = encoded.Select(t => t.AttentionMask).ToArray(),
                TypeIds = encoded.Select(t => t.TokenTypeIds).ToArray(),
            };


            var modelPath = @"F:\AI\Model\Bert\bert_goemotion_6500.onnx";

            using var runOptions = new RunOptions();
            using var session = new InferenceSession(modelPath);

            // Create input tensors over the input data.
            using var inputIdsOrtValue = OrtValue.CreateTensorValueFromMemory(bertInput.InputIds,
                  new long[] { 1, bertInput.InputIds.Length });

            using var attMaskOrtValue = OrtValue.CreateTensorValueFromMemory(bertInput.AttentionMask,
                  new long[] { 1, bertInput.AttentionMask.Length });

            using var typeIdsOrtValue = OrtValue.CreateTensorValueFromMemory(bertInput.TypeIds,
                  new long[] { 1, bertInput.TypeIds.Length });

            var inputs = new Dictionary<string, OrtValue>
            {
                { "input_ids", inputIdsOrtValue },
                { "attention_mask", attMaskOrtValue },
                { "token_type_ids", typeIdsOrtValue }
            };

            // Run the session, feeding in the inputs to get the inference output.
            using var output = session.Run(runOptions, inputs, session.OutputNames);
            var predictedClassIndex = GetMaxValueIndex(output[0].GetTensorDataAsSpan<float>());

            var emotions = new List<string>
            {
                "admiration", "amusement", "anger", "annoyance", "approval", "caring", "confusion",
                "curiosity", "desire", "disappointment", "disapproval", "disgust", "embarrassment",
                "excitement", "fear", "gratitude", "grief", "joy", "love", "nervousness", "optimism",
                "pride", "realization", "relief", "remorse", "sadness", "surprise", "neutral"
            };

            var predictedLabel = emotions[predictedClassIndex];

            Console.WriteLine($"Predicted Emotion: {predictedLabel}");

        }

        public struct BertInput
        {
            public long[] InputIds { get; set; }
            public long[] AttentionMask { get; set; }
            public long[] TypeIds { get; set; }
        }

        static int GetMaxValueIndex(ReadOnlySpan<float> span)
        {
            float maxVal = span[0];
            int maxIndex = 0;
            for (int i = 1; i < span.Length; i++)
            {
                var v = span[i];
                if (v > maxVal)
                {
                    maxVal = v;
                    maxIndex = i;
                }
            }
            return maxIndex;
        }


    }
}

By the way, this model is for a text-classification task.
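A side note on the post-processing step: the code above picks only the arg-max class index. If you also want class probabilities, you can apply a softmax to the raw output scores (assuming, as is typical for classification heads, that the first output tensor holds unnormalized logits). A minimal Python sketch of that step:

```python
import math

def softmax(logits):
    # Numerically stable softmax: shift by the max before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Three hypothetical class logits for illustration:
probs = softmax([2.0, 1.0, 0.1])
print([round(p, 3) for p in probs])  # → [0.659, 0.242, 0.099]
```

The arg-max of the softmax output is the same as the arg-max of the raw logits, so this only matters when you need calibrated-looking scores rather than just the predicted label.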