Turn Based Training and Parallelism

Hi,

For non-turn-based training (most of the examples provided), I can parallelize training by placing multiple training environments in a single scene. This works fine with the single Academy instance that manages its own stepping.

In a turn-based environment where I manage stepping myself by calling Academy.Instance.EnvironmentStep() after every turn, it seems that I cannot parallelize in the same way. The single Academy instance is shared across all environments and, of course, the different environments might run at different speeds.

Hmm… but now that I write this, I guess I just need one manager that waits for all environments to take a turn and then advances the step. That should work fine. All the environments will move in sync, but, of course, the moves will be different.

Does that seem reasonable? Is that the right way to go about this?

Thank you
Dan

Hi Dan - you won’t need to include multiple training environments in a single scene. In the trainer, you can use --num-envs=N to spin up multiple environments during training. This should make it simpler. See “Training Using Concurrent Unity Instances” in the mlagents-learn documentation:

https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Training-ML-Agents.md

Thank you. Sorry, I should have mentioned that I knew I could do that. But the --num-envs=N option seems much more heavyweight, as each of the N environments brings up an entire Unity process. While I could easily train ~20 separate instances inside one scene, I don’t think my machine would like me launching 20 copies of the process at the same time. :)

Ah, I see. If that is the case, you may need to implement the solution you described: once all the turns are complete, call the Academy step. How are you currently implementing the Academy step?

Right now I basically have:

void Update() {
  // Each turn: request a decision, then advance the shared Academy by hand.
  if (player1Turn) {
    player1.RequestDecision();
    Academy.Instance.EnvironmentStep();
  }

  if (player2Turn) {
    player2.RequestDecision();
    Academy.Instance.EnvironmentStep();
  }
}

This lets me externally set when it is each player’s turn. So you can see how having multiple instances of this won’t work. Instead, I would do something like this in an external GameManager…

void Start() {
  // launch multiple instances of the learning environment from a prefab
}

void Update() {
  // check all instances; do nothing until every one reports turnDone,
  // then reset their flags and advance the shared Academy once
  Academy.Instance.EnvironmentStep();
}

// And each instance would do something like:

void Update() {
  // Request a decision for whoever is up, then tell the manager we are ready.
  if (player1Turn) {
    player1.RequestDecision();
    turnDone = true;
  }

  if (player2Turn) {
    player2.RequestDecision();
    turnDone = true;
  }
}
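
For anyone who wants the "wait for everyone, then step once" bookkeeping spelled out, here is a minimal sketch of it as a plain C# class with no Unity dependency. TurnBarrier, ReportTurnDone and StepCount are hypothetical names made up for illustration; they are not part of ML-Agents. It assumes the manager hands in a callback that wraps Academy.Instance.EnvironmentStep(), and that automatic stepping has been turned off (I believe via Academy.Instance.AutomaticSteppingEnabled = false, but check that against your package version).

```csharp
using System;

// Hypothetical helper (not part of ML-Agents): collects "turn done" reports
// and fires one shared step once every environment has reported in.
public sealed class TurnBarrier
{
    private readonly bool[] _done;   // one flag per environment instance
    private readonly Action _step;   // in Unity: () => Academy.Instance.EnvironmentStep()
    private int _remaining;          // environments still to report this round

    // Number of shared steps triggered so far.
    public int StepCount { get; private set; }

    public TurnBarrier(int environmentCount, Action step)
    {
        if (environmentCount <= 0)
            throw new ArgumentOutOfRangeException(nameof(environmentCount));
        _step = step ?? throw new ArgumentNullException(nameof(step));
        _done = new bool[environmentCount];
        _remaining = environmentCount;
    }

    // Called by environment `index` after it has requested its decision.
    // Returns true if this report completed the round and triggered the step.
    public bool ReportTurnDone(int index)
    {
        if (_done[index]) return false;   // duplicate report within a round: ignore
        _done[index] = true;
        _remaining--;
        if (_remaining > 0) return false; // still waiting on other environments

        _step();                          // everyone is ready: advance exactly once
        StepCount++;
        Array.Clear(_done, 0, _done.Length);
        _remaining = _done.Length;        // start the next round
        return true;
    }
}
```

In the setup above, the GameManager would create one TurnBarrier for all the prefab instances, and each instance would call ReportTurnDone(myIndex) where the snippet sets turnDone = true. Ignoring duplicate reports means an instance that calls it on several frames while waiting for the slower environments does no harm.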