Hi everyone,
I am trying to make a shooter game using the Self-Play feature. I closely followed the Soccer example, except that I added one vector observation: an integer holding the ammo count. The rest of the configuration is the same, i.e. the raycasting, the 2 vs 2 scenario, etc.
However, when I hit train, all the agents performed the same actions, much like when I control the agents heuristically (without training) to test the environment.
I then removed the vector observation and trained again, and this time it worked: the agents performed different actions.
Can somebody please explain why adding the vector observation resulted in such weird behavior?
Thank you