After many iterations on my agents and environment, I noticed that not every training run with the same parameters converges. That is reasonable to some extent, but I don't know how I can increase the robustness of my training runs.
In the graph below you can see 3 training runs with exactly the same parameters. How can I trust my findings if the results change with every run?
If your environment isn’t procedurally generated, you should be able to set a training seed to help with your tests.
mlagents-learn ... --seed SEED ...
The other thing you can do is run multiple learning experiments and do a standard statistical analysis to compare the runs and see if there are significant differences.
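If it helps, here is a rough sketch of what that comparison could look like in Python. The reward numbers are placeholders, not real data; in practice you would pull the final mean reward of each seeded run from TensorBoard or the run's stats, and the Welch's t-test is just one common choice of test.

# Minimal sketch: compare final mean rewards from repeated runs of two
# configurations with a Welch's t-test (does not assume equal variances).
import numpy as np
from scipy import stats

# Hypothetical final cumulative rewards, one value per seeded run.
config_a_rewards = np.array([0.81, 0.76, 0.84, 0.79, 0.82])
config_b_rewards = np.array([0.71, 0.88, 0.65, 0.90, 0.70])

print(f"A: mean={config_a_rewards.mean():.3f}, std={config_a_rewards.std(ddof=1):.3f}")
print(f"B: mean={config_b_rewards.mean():.3f}, std={config_b_rewards.std(ddof=1):.3f}")

# Small p-value -> the difference between configurations is unlikely to be
# explained by run-to-run variance alone.
t_stat, p_value = stats.ttest_ind(config_a_rewards, config_b_rewards, equal_var=False)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")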