Hi, how would I go about integrating the GPT-2 model linked here https://github.com/onnx/models/blob/main/text/machine_comprehension/gpt-2/model/gpt2-10.onnx (or any other text generation model) with Sentis? In particular, how should I handle the string tokenization and conversions for the input and output?
Unfortunately, you will have to implement the tokenization in C# yourself.
For GPT-2, that means the OpenAI GPT-2 tokenizer.
It uses byte-pair encoding (BPE), which is not too hard to implement.
(For reference, see the Bpe class in Microsoft.ML.Tokenizers on Microsoft Learn.)
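To give a feel for how small the core of BPE is, here is a minimal sketch of the merge loop. It is written in Python for brevity, but the logic should port to C# fairly directly. Everything named `TOY_*`, along with `bpe_merge`, `encode`, and `decode`, is made up for illustration and is not part of Sentis or any library. The real GPT-2 tokenizer loads its vocabulary and ranked merges from the `vocab.json` and `merges.txt` files published with the model, splits the text with a regex first, and works on bytes rather than characters, none of which is shown here.

```python
def bpe_merge(symbols, ranks):
    """Repeatedly merge the adjacent pair with the lowest rank until none apply."""
    symbols = list(symbols)
    while len(symbols) > 1:
        pairs = [(symbols[i], symbols[i + 1]) for i in range(len(symbols) - 1)]
        # Pick the pair that was learned earliest (lowest rank); unknown pairs rank last.
        best = min(pairs, key=lambda p: ranks.get(p, float("inf")))
        if best not in ranks:
            break  # no known merge applies any more
        merged = []
        i = 0
        while i < len(symbols):
            if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == best:
                merged.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                merged.append(symbols[i])
                i += 1
        symbols = merged
    return symbols


# Hypothetical toy tables (assumption: stand-ins for merges.txt and vocab.json).
TOY_MERGES = [("l", "o"), ("lo", "w"), ("e", "r")]
TOY_RANKS = {pair: rank for rank, pair in enumerate(TOY_MERGES)}
TOY_VOCAB = {"low": 0, "er": 1, "l": 2, "o": 3, "w": 4, "e": 5, "r": 6}
TOY_INVERSE = {token_id: token for token, token_id in TOY_VOCAB.items()}


def encode(word):
    """Map a word to token ids: merge its characters, then look each token up."""
    return [TOY_VOCAB[token] for token in bpe_merge(word, TOY_RANKS)]


def decode(ids):
    """Map token ids back to text by concatenating the token strings."""
    return "".join(TOY_INVERSE[token_id] for token_id in ids)


if __name__ == "__main__":
    print(bpe_merge("lower", TOY_RANKS))
    print(encode("lower"))
    print(decode(encode("lower")))
```

The ids from `encode` are what you would pass to the model as an integer input tensor, and `decode` is the reverse step for the ids the model produces. This toy vocabulary only covers a handful of characters, so an input containing any other character raises a `KeyError`. The real tokenizer avoids that by starting from bytes, so every input is representable.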
String tensor support and conversion are under development. Internally, this is known as issue 109.