First off, I am assuming that step 3 of the offline training instructions in ml-agents/docs/Training-Imitation-Learning.md at 0.6.0 · Unity-Technologies/ml-agents · GitHub is outdated, so I skipped it. If that step is not outdated and I do still need to do something there, please let me know.
I then followed the remaining steps and wrote this in the config file:
Player 3 AI:
    max_steps: 5.0e4
    batch_size: 128
    buffer_size: 2048
    beta: 1.0e-2
    hidden_units: 256
    summary_freq: 2000
    time_horizon: 64
    num_layers: 2
    behavioral_cloning:
        demo_path: Assets/Demonstrations/demonstration 2.demo
        strength: 1
        steps: 200000
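In case it matters, I start training the standard way (the run id and config file name below are just what I happen to use locally, so treat them as placeholders):

    mlagents-learn config/trainer_config.yaml --run-id=imitation-test --train

and then press Play in the editor when the console tells me to.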
The image of my agent script is also attached at the bottom.

However, when training the agent, it doesn't seem to follow the behavior from the demonstration at all. I could understand it needing more demonstration data, but it looks as if it's just acting randomly, not even trying to match the demonstration. Also, I don't quite understand why you need to train an agent that is trying to duplicate a behavior (behavioral cloning) in the first place; wouldn't the agent already have all of the data it needs? Finally, when I get the brain from the player, the actions are just as erratic as when I'm training the agent. Is step 3 important? Am I configuring the config file incorrectly? Do I just need more training data, or fewer vector observations (I currently have 22)? Any other ideas? Should I try online training instead?
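One more thing I'm wondering about, in case this is a version mix-up: as far as I can tell, offline imitation learning in 0.6.0 uses a dedicated offline_bc trainer with demo_path as a top-level key (there is an offline_bc_config.yaml sample in the repo), rather than a behavioral_cloning sub-section. From memory it looks roughly like this (I've kept my brain name and demo path, but the keys and values are only my recollection of the sample, so treat this as a sketch, not something I've verified):

    Player 3 AI:
        trainer: offline_bc
        max_steps: 5.0e4
        batch_size: 64
        batches_per_epoch: 5
        hidden_units: 256
        num_layers: 2
        summary_freq: 2000
        demo_path: Assets/Demonstrations/demonstration 2.demo

Is that the layout 0.6.0 expects? If so, maybe my behavioral_cloning block is just being ignored and I'm effectively running plain PPO, which would explain the random-looking behavior.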
Thanks in advance!!!