As an experiment, I decided to train an ML-Agents agent to act as the opponent in my prototype RTS game.
The plan is for the agent to control a group of units of various classes (warrior, archer, etc.).
I understand that I have to run each training session with a different number and composition of units.
To select a unit to give an order to, I use something like the following code:
public override void OnActionReceived(float[] vectorAction)
{
    // Process actions. They arrive as floats, and C# has no implicit
    // float-to-int conversion, so the unit index needs an explicit cast.
    int selectedUnitIndex = (int)vectorAction[0];
    Vector3 targetPosition = new Vector3(vectorAction[1], vectorAction[2], vectorAction[3]);

    Unit selectedUnit = GetUnitAtIndex(selectedUnitIndex); // get the unit based on the index
    selectedUnit.MoveTo(targetPosition); // example command
}
Is this the right approach? What confuses me is this: even if the unit properties are passed in as observations, the agent will not be able to "understand" which properties belong to the unit it is controlling at the moment, because the list of units (and therefore the meaning of each index) will be different in every training session.
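To make the concern concrete, here is a minimal sketch of one observation layout I am considering, not something I have confirmed works. It assumes a fixed maximum squad size and gives every unit index its own fixed-size slot, so that slot i in the observation always describes the unit that action index i would select, and empty slots are zero-padded. `UnitInfo`, `UnitObservations`, `MaxUnits`, `NumClasses` and the chosen fields are all hypothetical names and values for illustration. The code is plain C# with no Unity dependency.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical unit description; the fields are illustrative only.
public readonly struct UnitInfo
{
    public readonly int ClassId;   // assumed to be in [0, NumClasses), e.g. 0 = warrior, 1 = archer
    public readonly float Health;  // assumed normalized to [0, 1]
    public readonly float X, Z;    // assumed normalized map position

    public UnitInfo(int classId, float health, float x, float z)
    {
        ClassId = classId;
        Health = health;
        X = x;
        Z = z;
    }
}

public static class UnitObservations
{
    public const int MaxUnits = 4;                   // assumed upper bound on squad size
    public const int NumClasses = 2;                 // assumed number of unit classes
    public const int SlotSize = 1 + NumClasses + 3;  // occupied flag + one-hot class + health, x, z

    // Builds a fixed-length vector in which slot i describes the unit at index i.
    // Slots beyond units.Count stay all-zero, including the "occupied" flag.
    public static float[] Build(IReadOnlyList<UnitInfo> units)
    {
        var obs = new float[MaxUnits * SlotSize];
        int count = Math.Min(units.Count, MaxUnits);

        for (int i = 0; i < count; i++)
        {
            int o = i * SlotSize;                     // start of slot i
            obs[o] = 1f;                              // slot is occupied
            obs[o + 1 + units[i].ClassId] = 1f;       // one-hot unit class
            obs[o + 1 + NumClasses] = units[i].Health;
            obs[o + 2 + NumClasses] = units[i].X;
            obs[o + 3 + NumClasses] = units[i].Z;
        }
        return obs;
    }
}
```

With this layout the observation length stays the same (`MaxUnits * SlotSize`) regardless of how many units a session starts with, and the resulting array could be fed to the agent's observations each step. What I am unsure about is whether the agent can learn to associate action index i with slot i when the composition keeps changing.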