I created an ONNX model from the facebook/musicgen-small model on Hugging Face using optimum-cli in a terminal.
This resulted in an ONNX folder containing the following .onnx files:
build_delay_pattern_mask
decoder_model
decoder_model_merged
decoder_with_past_model
encodec_decode
text_encoder
After that, I wrote the following code to load the model and set a prompt:
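// Assumption: the Unity Sentis package, where ModelAsset, Model and ModelLoader live in Unity.Sentis.
using Unity.Sentis;
using UnityEngine;

[RequireComponent(typeof(AudioSource))]
public class MusicGenExample : MonoBehaviour
{
    // Assigned in the Inspector: one of the exported .onnx files, imported as a ModelAsset.
    [SerializeField]
    private ModelAsset _ModelAsset;

    private AudioSource _audioSource;
    private Model _model;
    private string _prompt;

    private void Awake()
    {
        Initialize();
    }

    private void Initialize()
    {
        // RequireComponent guarantees an AudioSource exists on this GameObject.
        _audioSource = GetComponent<AudioSource>();

        // Load the serialized ONNX asset into a runtime model.
        _model = ModelLoader.Load(_ModelAsset);
        _prompt = "A happy, upbeat pop song";
    }
}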
But now I am lost on how to proceed. How do I pass the prompt to the model and generate the desired output (a song matching the prompt)? And which of the generated .onnx files should I use for that?

So basically, I'd like to know how to figure out which .onnx file(s) I need, how to access them, and what inputs and outputs they expect. Once I understand the workflow for this model, converting other models from Hugging Face should hopefully be easier.

Thanks in advance!