Imitation Learning isn't working as expected

First off, I am assuming that step 3 of the offline training instructions in ml-agents/docs/Training-Imitation-Learning.md at 0.6.0 · Unity-Technologies/ml-agents · GitHub is outdated, so I skipped it. If this step is not outdated and I need to do something, please let me know.

I then followed the steps and wrote this in the config file:

Player 3 AI:
    max_steps: 5.0e4
    batch_size: 128
    buffer_size: 2048
    beta: 1.0e-2
    hidden_units: 256
    summary_freq: 2000
    time_horizon: 64
    num_layers: 2
    behavioral_cloning:
        demo_path: Assets/Demonstrations/demonstration 2.demo
        strength: 1
        steps: 200000
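
For anyone who wants to sanity-check a config like this before training, here is a rough Python sketch. It is not part of ML-Agents: the `check_config` helper and the three checks are my own assumptions about what is worth looking at, and the values are simply copied by hand from the config above.

```python
# Values copied by hand from the config above. The checks are assumptions,
# not rules enforced by ML-Agents itself.
config = {
    "max_steps": 5.0e4,
    "batch_size": 128,
    "buffer_size": 2048,
    "behavioral_cloning": {
        "demo_path": "Assets/Demonstrations/demonstration 2.demo",
        "strength": 1,
        "steps": 200000,
    },
}


def check_config(cfg):
    """Return a list of human-readable notes about the config (hypothetical helper)."""
    notes = []

    # The buffer is consumed in batches, so a clean multiple is the usual advice.
    if cfg["buffer_size"] % cfg["batch_size"] != 0:
        notes.append("buffer_size is not a multiple of batch_size")

    bc = cfg.get("behavioral_cloning", {})

    # If the cloning schedule is longer than the run, it never finishes in this run.
    if bc.get("steps", 0) > cfg["max_steps"]:
        notes.append("behavioral_cloning steps exceed max_steps")

    # A space in the path is legal, but it is an easy place for a typo.
    if " " in bc.get("demo_path", ""):
        notes.append("demo_path contains a space; check it matches the file name exactly")

    return notes


for note in check_config(config):
    print(note)
```

With the values above, this flags that the behavioral cloning `steps` (200000) is larger than `max_steps` (5.0e4) and that the demo path contains a space. Neither is necessarily a mistake, but both seem worth double-checking.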

Also, the image of my agent script is attached at the bottom.

However, when training the agent, it doesn’t seem to follow the behavior of the demonstration at all. I could understand needing more data for the demonstration, but it looks as if it’s just acting randomly, not even trying to match the demonstration. Also, I don’t understand why you would need to train an agent that is trying to duplicate a behavior (behavioral cloning). Wouldn’t the agent already have all of the data it needs? Finally, when I get the brain from the player, the actions are just as sporadic as when I’m training the agent. Is step 3 important? Am I setting up the config incorrectly? Do I just need more training data, or even fewer vector observations (I currently have 22)? Any other ideas? Should I try online training instead?

Thanks in advance!!!

Hello,

The quality of the demonstration file is bounded by how informative the observations being used are. If it is not possible to map from observations → actions given the information provided, then it is unlikely that the agent can learn well from demonstrations. Can you provide a little more information about the problem and how you have broken it down into observations and actions?
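
To make the observations → actions point concrete, here is a minimal Python sketch. It is not ML-Agents code: the `find_conflicts` function and the sample data are made up for illustration, and it assumes you can get your demonstration into a list of (observation, action) pairs. It reports observations that were recorded with more than one action, which is a sign that the observations alone do not determine what to do.

```python
from collections import defaultdict


def find_conflicts(pairs, decimals=2):
    """Group samples by rounded observation and return the observations
    that were paired with more than one distinct action."""
    actions_by_obs = defaultdict(set)
    for obs, action in pairs:
        # Round so that nearly identical float observations share a key.
        key = tuple(round(x, decimals) for x in obs)
        actions_by_obs[key].add(action)
    return {k: v for k, v in actions_by_obs.items() if len(v) > 1}


# Hypothetical demonstration data: (observation vector, discrete action).
demo = [
    ((0.0, 1.0), "left"),
    ((0.0, 1.0), "right"),  # same observation, different action
    ((0.5, 0.2), "jump"),
]

conflicts = find_conflicts(demo)
print(conflicts)
```

If many observations show up with conflicting actions, the agent is probably missing some piece of information that you were using when you recorded the demo.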

I can't get imitation to work either; it doesn't follow the demo in the slightest. ML in general has been severely disappointing. I can't even train a bot to play a simple 2D game.

Dude :slight_smile: It takes patience and lots of time. :slight_smile:

Notice that you are trying to get behavioral cloning to work with just one single demo recording?

I get decent results for simple games when using 50 to 100 demo recordings.

Also try GAIL.