How was the TinyStories model exported to ONNX?

I have trouble understanding what you changed in the model structure of the original roneneldan/TinyStories-33M, why, and how.
The ONNX export of the original TinyStories model appears to have the following structure:
three inputs (input_ids, attention_mask, position_ids) and a single output (logits) - which is exactly the structure of the GPT-Neo model it is based on.

The Sentis TinyStories model, by contrast, has only ONE input (an input index) and NINE outputs (keys and values).

To check the difference, I used Python optimum to export an ONNX model from the original roneneldan/TinyStories-33M (PyTorch) model with:

```
optimum-cli export onnx --opset 15 --model model --task text-generation model_tinyorig_ONNX
```
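For reference, I compared the two graphs by printing their inputs and outputs with the onnx Python package (a minimal check; the path is the output folder from my export above):

```python
import onnx

# optimum-cli writes the exported graph as model.onnx inside the output folder
m = onnx.load("model_tinyorig_ONNX/model.onnx")
print("inputs: ", [i.name for i in m.graph.input])
print("outputs:", [o.name for o in m.graph.output])
```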

The reason I did this: I trained a new GPT-Neo model (which has the same structure as roneneldan/TinyStories-33M), exported it to ONNX with the same CLI command as above, and tried to use it instead of the Sentis model in the same sample code (tiny stories.cs). I get the same error as with the original roneneldan model, namely that the model's input dimensions are not correct (which is to be expected, since the models' input and output structures differ).
(error message: AssertionException: ModelOutputs.ValueError: inputs length does not equal model input count 1, 3 Assertion failure. Value was False Expected: True)

Is there a guide or description of how exactly you exported the model to ONNX, or any further explanation of the process?

Thank you in advance!

The TinyStories model of the Sentis sample on HuggingFace was changed to be compatible with Sentis. To answer your question on ONNX exporting, we have a guide on how to Export and convert a file to ONNX

Thank you very much for the answer.
Is there any documentation or example of how exactly you exported the TinyStories ONNX model (i.e. which parameters were changed)?
The general description is clear to me, but I cannot reproduce the export without the parameters.
To be more specific, I would be interested in a snippet like the following from the PyTorch page, with your parameters:

```python
import torch

# Input to the model
x = torch.randn(batch_size, 1, 224, 224, requires_grad=True)
torch_out = torch_model(x)

# Export the model
torch.onnx.export(torch_model,               # model being run
                  x,                         # model input (or a tuple for multiple inputs)
                  "super_resolution.onnx",   # where to save the model (can be a file or file-like object)
                  export_params=True,        # store the trained parameter weights inside the model file
                  opset_version=10,          # the ONNX version to export the model to
                  do_constant_folding=True,  # whether to execute constant folding for optimization
                  input_names=['input'],     # the model's input names
                  output_names=['output'],   # the model's output names
                  dynamic_axes={'input': {0: 'batch_size'},    # variable length axes
                                'output': {0: 'batch_size'}})
```
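For what it's worth, here is my own guess at how a single-input, nine-output graph could be produced: wrap the model so that it takes only input_ids and returns the logits together with the flattened per-layer key/value cache, then export the wrapper. To be clear, everything below (the wrapper class, output names, file name) is my assumption, not your actual export script:

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("roneneldan/TinyStories-33M")
model.eval()

class SingleInputWrapper(torch.nn.Module):
    """Hypothetical wrapper: one input (input_ids), logits plus flattened KV cache as outputs."""
    def __init__(self, inner):
        super().__init__()
        self.inner = inner

    def forward(self, input_ids):
        out = self.inner(input_ids=input_ids, use_cache=True)
        # Depending on the transformers version, past_key_values may be a Cache
        # object; iterating it yields the legacy (key, value) pair per layer.
        kv = [t for layer in out.past_key_values for t in layer]
        return (out.logits, *kv)

wrapper = SingleInputWrapper(model)
dummy = torch.randint(0, model.config.vocab_size, (1, 16))

# logits + num_layers x (key, value); for a 4-layer GPT-Neo that is 1 + 8 = 9 outputs
output_names = ["logits"] + [
    f"{kind}{i}" for i in range(model.config.num_layers) for kind in ("key", "value")
]

torch.onnx.export(
    wrapper,
    (dummy,),
    "tinystories_single_input.onnx",   # made-up file name
    opset_version=15,
    input_names=["input_ids"],
    output_names=output_names,
    dynamic_axes={"input_ids": {0: "batch", 1: "sequence"}},
)
```

If this is roughly what was done, then the exact output names and dynamic axes would be precisely the parameters I am missing.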

Hey, did you find anything about it?