I’m quite excited about the potential of what we can do with Unity Sentis. So I published the first version of Sharp Transformers. The C# version of the Hugging Face Transformers Library .
When you want to run a Transformer model with Unity Sentis, you need first to tokenize the text: since Transformers models can’t take a string as input. It needs to be translated into numbers.
This tokenizer is not part of the ONNX and must be coded in C#.
Therefore, we coded this part in C# to work with Unity.
For now, we coded only BERT models tokenizers but we plan to add more tokenizers (whisper, llama 2) in the upcoming weeks.
This allows you for now to run the tokenizer of Bert models in Unity Sentis..
If you have questions, feedback don’t hesitate to ask them here I’ll reply
Hi Simon (or is it Thomas?), so did you create the C# transformers library? Does Huggingface know about it yet? Maybe you work for them IDK. Or is Huggingface all volunteer open source programmers… even the transformers library?
Can you say any more about which parts you’ve implemented, which parts you plan to implement and which you don’t in the future?
Hey !
Very excited to dive in your work !
Came accross your substack post!
Quick note so far:
On opening project there are 2 errors like shown in the picture. I solved it just by changing the order of the MatMul2D method arguments in SentenceSimilarity.cs (line 92).
A few notes on the behavior:
I find it not 100% consistant:
“Go to red object” → robot moves to the red cube
“Go to blue object” → robot brings the blue cube
“Go to yellow object” → robot moves to the yellow cube
Any suggestions on why is that ?
Side question. Where should I start if I want to add to the robot feature the ability to reply to any question (like ChatGPT would) with text (or ideally voice)
Hey there
This error comes from the fact you use 1.1.1-exp2. We will update the demo this week.
For the not consistant behavior I’m agree we got some problems. The solution is:
Either modify the action definition (in the inspector) to be more precise. For instance testing with Go To instead of move to to see if the result is better in general
Either change model
For your side question. Well you need two things:
Speech to Text like whisper (we’re working on a STT tokenizer)
Generative text model like Llama
TSS (text to speech model)
The problem is that you’ll not be able to run all these 3 in local inference given that will take too much VRAM. For me the best is to get:
My question might find noob since I’m new to Sentis and AI. I tried running the sample project and it was successful but the problem is each time I enter some text and submit game freezes for about 300ms and then it works. Can you please tell me what could be the issue and how to fix it? I’m running on Unity 2022.3.11f1. Also I’m using Sentis version 1.2.0-exp.2. Here is the profiler screenshot.
Hi, I have signed up the Unity Beta program but I get stuck with this “Asset Packages/com.huggingface.sharp-transformers/LICENSE has no meta file, but it’s in an immutable folder. The asset will be ignored.”
Hi Simon, Very great work, did you add the llama2 tokenizer because i am stuck in using llama2 models in unity, will you please help me a little bit in this.
Thanks.
Can not find out how to get the tokenizer to work: Dictionary<string, Tensor> inputSentencesTokensTensor = SentenceSimilarityUtils_.TokenizeInput(input);
I would like to use the GroNLP/bert-base-dutch-cased tokenizer
That is fine. When an asset is in the StreamingAssets folder it is not viewable in the inspector. This folder is for large files you want to load by file name.
If you put the model in the main Assets folder then it should be viewable in the inspector.