Observation size vs character type size

I have a board game that is fairly complex and have completely remodeled my observation space architecture 3 times.

The first was to create an observation for every square (40), with around 400 one-hot “characters” that can be in each square.

The second was to create an observation only for each “character” and not for each location, reducing the observation count to 14 but pushing the number of “characters” (now including their location) up to 16000. (Is that really inefficient for one-hot?)
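For a sense of scale, here is a rough sketch of the input sizes for the first two layouts, assuming every slot really is one-hot encoded over its full set of categories. The numbers come from the post; the helper name `one_hot_width` is made up for illustration.

```python
# Rough input-size arithmetic for the first two layouts, using the figures above.
SQUARES = 40                   # board squares
CHARACTERS = 400               # approximate one-hot "characters" per square
PIECE_SLOTS = 14               # observations in the second layout
CHARACTER_AT_LOCATION = 16000  # character-plus-location variants


def one_hot_width(num_slots: int, num_categories: int) -> int:
    """Total number of floats when each slot is one-hot encoded."""
    return num_slots * num_categories


layout_1 = one_hot_width(SQUARES, CHARACTERS)
layout_2 = one_hot_width(PIECE_SLOTS, CHARACTER_AT_LOCATION)

print(layout_1)  # 16000
print(layout_2)  # 224000
```

So under that assumption the second layout has fewer observations but a much wider input overall.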

Then I thought that separating the data would make learning easier for the Agent, so I split it into “Type”, “Location” and “Trait”, using 3 observations per possible piece (13 max, 39 observations) and ignoring empty spaces. My initial thought was that it would be able to better learn the effect of each: all type 1 characters move like this, all trait 3 pieces are immobilized, etc. But then again, this opens up more opportunity for the Agent to confuse the reward / punishment across the observations and even attribute traits of one piece to another.

I ran 1.6 million test games on the first structure and already got pretty good evidence that it was learning well. Before committing to running tens or hundreds of millions of games for my Agent, I want to make sure the best option is in place.

There must be some best practice that skews either towards separating every possible combination into its own “character” (16000 variants), or towards increasing the observations to matrix out the variables, or some middle ground? My three structures are maybe all extremes, although I could definitely matrix out the variables further and fill a quaternion (which for MLAgents is one Int and 4 positions x,y,z,w worth of information), but that would increase the observation size to 65.

Also, in my 13x3 architecture I just had 39 individual integers every observation call. I got to wondering (this is probably very ignorant): if I made them a vector3, would the Agent group the info as relevant to one piece, since the ‘type’, ‘location’ and ‘traits’ should only affect the consideration of that piece?

I promise I did look for these kinds of answers! Help a noob out! :slight_smile:

TLDR:

How many is too many for one-hot?
Matrixing data vs single giant list?
Observation lean or “Character type” lean?
Can vector3 / Quaternion work well for grouping character data in a matrix?

TIA

Hi Dscvr,

According to the best practices in the docs, ‘categorical variables’ should always be one-hot encoded. I believe that has to do with the fact that all observations are normalized at some point (this is conjecture, I may be wrong here).
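To make the one-hot advice concrete, here is a minimal sketch of encoding one piece from the “Type”, “Location”, “Trait” layout. This is plain Python for illustration, not the ML-Agents API. The 40 locations come from the question; `NUM_TYPES`, `NUM_TRAITS`, and the function names are hypothetical placeholders, since the real counts aren't given.

```python
from typing import List

NUM_TYPES = 5       # hypothetical: number of piece types
NUM_LOCATIONS = 40  # from the question: 40 squares
NUM_TRAITS = 4      # hypothetical: number of traits


def one_hot(index: int, size: int) -> List[float]:
    """Return a list of `size` floats with a single 1.0 at `index`."""
    if not 0 <= index < size:
        raise ValueError(f"index {index} out of range for size {size}")
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def encode_piece(piece_type: int, location: int, trait: int) -> List[float]:
    """Concatenate the three one-hot blocks describing one piece."""
    return (
        one_hot(piece_type, NUM_TYPES)
        + one_hot(location, NUM_LOCATIONS)
        + one_hot(trait, NUM_TRAITS)
    )


encoded = encode_piece(piece_type=1, location=7, trait=3)
print(len(encoded))  # 49
print(sum(encoded))  # 3.0
```

Whatever the reason in the docs, one thing a raw integer ID does is suggest an ordering (type 3 is "more" than type 1) that a categorical value doesn't have; the one-hot form avoids that.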

Whatever is easier for you to get into the observation space or to understand conceptually. All observations will be converted to a 1D ordered list of floats when fed to the algorithm anyway.

No, the agent never infers relationships between observations based on grouping, because no grouping is left after the observations are processed.
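A small sketch of why the grouping disappears, assuming the flattening works as described above. Tuples stand in for vector3 values here, the numbers are arbitrary placeholders, and `flatten_grouped` is a hypothetical helper, not an ML-Agents function.

```python
from typing import List, Tuple


def flatten_grouped(pieces: List[Tuple[float, float, float]]) -> List[float]:
    """Flatten (type, location, trait) triples into one ordered list of floats."""
    return [value for piece in pieces for value in piece]


# Two pieces added as vector3-style triples...
grouped = [(1.0, 7.0, 3.0), (2.0, 12.0, 0.0)]

# ...versus the same six values added one at a time.
separate = [1.0, 7.0, 3.0, 2.0, 12.0, 0.0]

flat = flatten_grouped(grouped)
print(flat == separate)  # True
```

Both routes give the same flat list, so what matters is keeping the order consistent from step to step, not how the values were packed.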

Again, sure, if it helps you out. But it won’t make any difference to the training (assuming correct implementation).

Hope this helps!

Hey Luke,

Yes, really helpful. Ty!