Using Collect Observations in Self-Play leads to issues

Hi everyone,

I am trying to make a shooter game using the Self-Play feature. I followed the Soccer example closely, except that I added one vector observation: an integer holding the ammo count. The rest of the configuration is the same, i.e. the raycasting, the 2 vs 2 scenario, etc.

However, when I start training, all the agents perform the same actions, much like when I control them heuristically to test the environment before training.

I then removed the vector observation and trained again, and now it works: the agents take different actions.

Can somebody please explain why adding the vector observation caused such weird behavior?

Thank you

Are there any errors in the console?

You can also check out our Tennis environment for an example of self-play with ray casts, though adding a vector observation should not by itself cause any strange behavior.
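
One thing that may be worth ruling out, although nothing in this thread confirms it as the cause, is the scale of the new observation. The ML-Agents documentation recommends normalizing vector observation values to a range such as [0, 1] or [-1, 1]. Below is a minimal sketch of that scaling, written in Python purely for illustration; in Unity the equivalent logic would live in the agent's C# observation code. The function name `normalize_ammo` and the `max_ammo` parameter are hypothetical and not taken from the original project.

```python
def normalize_ammo(ammo_count: int, max_ammo: int) -> float:
    """Scale an integer ammo count into the range [0, 1].

    Hypothetical helper: `max_ammo` is an assumed upper bound on the
    ammo an agent can carry; the thread does not state one.
    """
    if max_ammo <= 0:
        raise ValueError("max_ammo must be positive")
    # Clamp first so an out-of-range count cannot leave [0, 1].
    clamped = min(max(ammo_count, 0), max_ammo)
    return clamped / max_ammo


if __name__ == "__main__":
    # Example: 5 rounds left out of an assumed maximum of 10.
    print(normalize_ammo(5, 10))
```

The clamped, scaled value is what would be passed as the single vector observation in place of the raw integer.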

Hi, I managed to fix the issue by removing the vector observations, so the agents now use only raycasting to observe the environment. Thanks!