ML Agents lots of zero observations

I am building an AI for a 2-4 player board game, and I want to use the same brain for any player count from 2 to 4. Because of this, if there are only 2 players, for example, all the data about the 2 missing players would be fed to CollectObservations as a bunch of zeroes (or negative ones or something). Is this okay to do?

Depends on what you mean by “ok”. You can do this and Unity/ML-Agents won't complain, but your results will not be good.

Without further information I can't really tell you how to approach this, but it sounds to me like you should read more about how to properly design states for Reinforcement Learning agents. If you


With your current approach, your “AI” would not be able to transfer a move learned in a 2-player game to a 4-player game, because the states would be completely different. This basically means that you'd have to train the same behaviour 3 times, at which point it would be more efficient to train 3 separate brains.

Or look at it from a different angle: if you have N states per player and you train a Brain with 4N states, of which 2N are always zero, then you could learn (about) twice as fast by cutting the unneeded states.

What you can do to work around this is to (for example) rework some states like:
“Is player 1 closer than X to me?”/“Is player 2 closer than X to me?”/“Is player 3 closer than X to me?” into “Is any player closer than X to me?” with a second state: “How many players are closer than X to me?”
This way you give the agent similar information while staying independent of the player count.
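
To make the idea concrete, here is a minimal sketch in plain Python (not ML-Agents code). The function name, the `max_opponents` parameter, and the choice to normalize the count are all assumptions made for illustration. It turns any number of opponent distances into the same two values.

```python
from typing import List, Sequence


def proximity_observation(
    distances: Sequence[float],
    threshold: float,
    max_opponents: int = 3,  # assumption: a 4-player game has at most 3 opponents
) -> List[float]:
    """Summarize opponent distances as a fixed-size, player-count-independent state."""
    # "How many players are closer than X to me?"
    close_count = sum(1 for d in distances if d < threshold)
    # "Is any player closer than X to me?"
    any_close = 1.0 if close_count > 0 else 0.0
    # Normalizing the count to [0, 1] is a design choice, not a requirement.
    return [any_close, close_count / max_opponents]


# The observation has length 2 whether there is 1 opponent or 3.
two_player_game = proximity_observation([2.0], threshold=3.0)
four_player_game = proximity_observation([2.0, 5.0, 1.0], threshold=3.0)
```

In Unity you would compute the same two values in C# and add them to the agent's observations. The point is only that the observation size no longer depends on how many players are in the game.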

This abstraction process is not easy. Take your time to do it right, as it will save you a lot of trouble later down the road.

Okay, so I think I actually came up with a better solution to this, although I have yet to implement it. The proper way of doing this is to use the BufferSensor component. It allows for a variable number of observations and is typically used when there is a variable number of entities in the scene that you want to keep track of. Order doesn't typically matter, although I will be adding the player number to each entity so that the agent knows the order in which moves are made. This is a perfect use case, because I have a variable number of possible opponents, all with the same set of data. So yeah, the BufferSensor is the solution to this problem. I'm glad I realized this, as it will make the implementation a lot easier and cleaner!
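
As a rough illustration of the data layout this implies, here is a plain-Python sketch. It is not the ML-Agents API: in Unity you would instead pass one float array per opponent to the BufferSensorComponent via AppendObservation. The constants, the two example features per opponent, and the explicit mask are assumptions for illustration only. Each opponent becomes a fixed-size vector with a one-hot seat number, and unused slots are padded and marked as absent.

```python
from typing import List, Sequence, Tuple

MAX_PLAYERS = 4            # seats 0..3
MAX_OPPONENTS = 3          # at most 3 opponents in a 4-player game
FEATURES_PER_OPPONENT = 2  # hypothetical, e.g. normalized score and pieces left
ENTITY_SIZE = MAX_PLAYERS + FEATURES_PER_OPPONENT


def encode_opponent(seat: int, features: Sequence[float]) -> List[float]:
    """Encode one opponent as one-hot seat number followed by its features."""
    if not 0 <= seat < MAX_PLAYERS:
        raise ValueError("seat out of range")
    if len(features) != FEATURES_PER_OPPONENT:
        raise ValueError("every opponent must have the same number of features")
    one_hot = [1.0 if i == seat else 0.0 for i in range(MAX_PLAYERS)]
    return one_hot + list(features)


def build_entity_buffer(
    opponents: Sequence[Tuple[int, Sequence[float]]],
) -> Tuple[List[List[float]], List[float]]:
    """Return (entities, mask): real opponents first, then zero padding."""
    if len(opponents) > MAX_OPPONENTS:
        raise ValueError("too many opponents")
    entities = [encode_opponent(seat, feats) for seat, feats in opponents]
    missing = MAX_OPPONENTS - len(entities)
    # mask is 1.0 for a real opponent and 0.0 for a padded slot
    mask = [1.0] * len(entities) + [0.0] * missing
    entities += [[0.0] * ENTITY_SIZE for _ in range(missing)]
    return entities, mask


# 2-player game: a single opponent sitting in seat 1.
entities, mask = build_entity_buffer([(1, [0.5, 0.2])])
```

The difference from simply feeding zeroes is that the padded slots are explicitly marked as absent rather than treated as real data. As far as I understand, the BufferSensor handles that padding and masking internally, so on the Unity side you only append the opponents that actually exist.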