Train agent to command non-consistent group of units

Just an idea. Split the agent's action into three parts:
1 action value to represent the “unit type” (warrior, archer, caster, etc.). Keep an array of unit types.
1 action value to represent the “unit action” (move, shoot, cast spell, dance, etc.). Keep an enum/array of ALL possible actions across every unit type.
3 action values to represent the “target position”.
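
To make that layout concrete, here is a rough sketch of how a 5-value action could be decoded. The names (`UnitType`, `UnitAction`, `Command`, `decode_command`) and the ordering of the five values are my own assumptions for illustration, not anything your framework requires.

```python
from enum import IntEnum
from typing import NamedTuple, Sequence, Tuple


class UnitType(IntEnum):
    WARRIOR = 0
    ARCHER = 1
    CASTER = 2


class UnitAction(IntEnum):
    MOVE = 0
    SHOOT = 1
    CAST_SPELL = 2
    DANCE = 3


class Command(NamedTuple):
    unit_type: UnitType
    action: UnitAction
    target: Tuple[float, float, float]


def decode_command(raw: Sequence[float]) -> Command:
    """Split a 5-value action into unit type, unit action and target position.

    Assumed layout: [unit_type, unit_action, target_0, target_1, target_2].
    """
    if len(raw) != 5:
        raise ValueError(f"expected 5 action values, got {len(raw)}")
    unit_type = UnitType(int(raw[0]))   # raises ValueError on an unknown index
    action = UnitAction(int(raw[1]))    # raises ValueError on an unknown index
    target = (float(raw[2]), float(raw[3]), float(raw[4]))
    return Command(unit_type, action, target)
```

Calling `decode_command([1, 1, 2.0, 0.0, -3.5])` would give an archer, a shoot action and the three target numbers as a tuple.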

Then, on each action, give a small reward if the “selected action” can be performed by the “selected unit type”, or maybe a small penalty if it can’t. So all 3 unit types could move, but only units 0 & 1 can shoot, and only unit 2 can cast a spell. (This lookup matrix would be the pain in the butt, or at least a large case statement.)
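
The lookup does not have to be a big case statement. A dictionary of sets might be enough. This is only a sketch: the enum names, the `ALLOWED` table and the reward sizes (+0.01 / -0.01) are assumptions I made up for the example, and the table just mirrors the move/shoot/cast example above.

```python
from enum import IntEnum


class UnitType(IntEnum):
    WARRIOR = 0
    ARCHER = 1
    CASTER = 2


class UnitAction(IntEnum):
    MOVE = 0
    SHOOT = 1
    CAST_SPELL = 2


# Which actions each unit type may perform (hypothetical table).
ALLOWED = {
    UnitType.WARRIOR: {UnitAction.MOVE, UnitAction.SHOOT},
    UnitType.ARCHER: {UnitAction.MOVE, UnitAction.SHOOT},
    UnitType.CASTER: {UnitAction.MOVE, UnitAction.CAST_SPELL},
}

# Reward sizes are placeholders; tune them for your own setup.
VALID_REWARD = 0.01
INVALID_PENALTY = -0.01


def is_valid(unit_type: UnitType, action: UnitAction) -> bool:
    """Return True if this unit type is allowed to perform this action."""
    return action in ALLOWED[unit_type]


def validity_reward(unit_type: UnitType, action: UnitAction) -> float:
    """Small reward for a valid pairing, small penalty for an invalid one."""
    return VALID_REWARD if is_valid(unit_type, action) else INVALID_PENALTY
```

Adding a new unit type or action then means adding one entry to the table rather than another branch of a case statement.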

Then, if it is one of the valid combinations of unit & action, call your actual code to perform the action and pass in the 3 target position numbers. A reward, a penalty, or nothing could then be applied again inside the “action performing code” to signal how the action turned out: the spell hits or misses, the unit can or cannot move to the new position, and such.
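
Roughly, the dispatch step could look like the sketch below. Everything in it is a stand-in: `move_to`, `cast_spell`, `HANDLERS`, `step_reward`, the enemy position, the map limit and all reward numbers are hypothetical, and your real game code would replace the two handler functions.

```python
from typing import Callable, Dict, Tuple

Target = Tuple[float, float, float]

# All numbers below are illustrative assumptions.
VALID_REWARD = 0.01
INVALID_PENALTY = -0.01
ENEMY_POSITION: Target = (1.0, 0.0, 1.0)  # hypothetical enemy location
MAP_LIMIT = 10.0                          # hypothetical half-width of the map


def move_to(target: Target) -> float:
    """Stand-in for real movement code: penalize targets outside the map."""
    reachable = all(abs(c) <= MAP_LIMIT for c in target)
    return 0.0 if reachable else -0.05


def cast_spell(target: Target) -> float:
    """Stand-in for real spell code: reward a hit, penalize a miss."""
    return 1.0 if target == ENEMY_POSITION else -0.1


# Only valid (unit type, action) pairs get a handler.
HANDLERS: Dict[Tuple[str, str], Callable[[Target], float]] = {
    ("warrior", "move"): move_to,
    ("archer", "move"): move_to,
    ("caster", "move"): move_to,
    ("caster", "cast_spell"): cast_spell,
}


def step_reward(unit_type: str, action: str, target: Target) -> float:
    """Validity reward plus whatever the action code reports back."""
    handler = HANDLERS.get((unit_type, action))
    if handler is None:
        return INVALID_PENALTY  # invalid pairing: nothing is executed
    return VALID_REWARD + handler(target)
```

The point of splitting it this way is that the agent gets one signal for picking a sensible unit/action pair and a separate one for whether the action actually worked.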

Anyhow, if I were gonna give it a go, that is what I would try.
