Inference works properly with multiple instances of the arena but not with one

Hi!
I trained an agent that plays Capture the Flag. So far, it captures the enemy flag and brings it back to the allied flag. During training, I use 8 instances of the arena. The model trains properly and can successfully capture a flag. But in inference, with only 1 instance of the arena, the agent does not behave properly. If I use 8 instances for inference as well, the agent behaves properly, exactly as in training.

Can something be done about this? Is it a known issue? What is happening?
I marked this as a bug, but I'm not sure if it is one.

Hi!
I am having the same problem. Maybe someone will find this and have an answer for us.