Hi, I’m working on a project that requires agents to switch between training and inference mode while the training session is running. I found it easy to switch modes at runtime by simply changing the behavior type. However, the agent uses the model that was loaded before the Unity instance started, rather than keeping the neural network policy that is currently being trained. Is it possible to keep the current neural network model when switching from training to inference? It seems I have to modify the source code, but I can’t find where to start.
Hi,
It’s possible to load .nn files from disk during training (not .onnx files yet, though). There is some example code for this here.
As of the last release, we don’t create .nn files during training, only at the end. The next release (and currently on the master branch) will create .nn files when it creates snapshots.
So putting those two together, I think you can do something like this:
- Determine which .nn file created by a snapshot to load
- Use the sample code to load the .nn file and produce a NNModel instance
- Call Agent.SetModel with the new NNModel, and set the behavior type to InferenceOnly.
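For the first step, a rough Python sketch of picking the newest snapshot might look like the following. It assumes the snapshot .nn files all sit in one directory that you pass in; the function name `latest_nn_snapshot` and the directory layout are made up for illustration, not part of ML-Agents. It picks by file modification time so it does not depend on how the snapshot files are named.

```python
from pathlib import Path
from typing import Optional


def latest_nn_snapshot(snapshot_dir: str) -> Optional[Path]:
    """Return the most recently modified .nn file in snapshot_dir, or None.

    Hypothetical helper: the directory passed in is assumed to be wherever
    the trainer writes its snapshot .nn files for one behavior.
    """
    # Collect every .nn file directly inside the directory (no recursion).
    candidates = [p for p in Path(snapshot_dir).glob("*.nn") if p.is_file()]
    if not candidates:
        return None
    # Use modification time rather than parsing step counts out of file names,
    # since the naming scheme is not specified here.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

The returned path is what you would then hand to the Unity side, where the sample code loads it into an NNModel before calling Agent.SetModel.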
Thank you for your response. I get the idea of setting the model. So currently we can’t use model snapshots when switching, because .nn files aren’t generated until training ends, and your team is working on this problem, right? That’s great news. By the way, is the current master branch already able to do this, or is it still in progress? I’m willing to try it out now if the answer is yes.
Yes, the feature was added on the master branch in Pull Request #4127, “Convert checkpoints to .NN” by sankalp04 (Unity-Technologies/ml-agents on GitHub).
Thank you so much! I’m looking into it now.
Hi. I’m trying this new feature on the master branch now. It seems to be working fine. The only problem is that it doesn’t generate a snapshot when I switch the behavior type. Is there any way to generate a snapshot and .nn model at the moment of switching from training to inference? Thank you.
Sorry, this isn’t possible right now; snapshots are only generated at regular intervals during training (controlled by checkpoint_interval in the trainer config).
Okay. Thank you anyway.