Hi, the docs say self-play can also be used for asymmetric games, but I’m not sure when to use it or how I would implement rewards for that. Is there a reward convention similar to the one for symmetric games, e.g. win: 1, lose: -1? Specifically, I have a racing game with five agents / agent instances. Should I set sparse rewards, so that the first agent to finish a lap gets the winner reward while the others receive penalties? Or is there room for differentiation, perhaps rewards depending on when (first, second, etc.) an agent completes a lap?
So far, I’m training with five distinct behaviors (no self-play) and set frequent rewards based on the relative lead an agent has compared to all the others. I have to see how this goes - training with five behaviors is slow though. Is there a chance this could be simplified to training a single behavior with self-play, although it’s not strictly symmetrical? Thanks!
I’ll kick this over to the team to see if they have some guidance to pass along!
Thanks - for this specific case, I was able to get the behavior I was looking for by simply rewarding speed and penalizing being passed by another agent.
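For anyone wanting to try the same idea, here is a rough sketch of that reward shaping in plain Python. It is not the code from my project: the function name, the `speed_scale` and `pass_penalty` values, and the use of race positions (1 = leader) to detect being passed are all assumptions made for illustration.

```python
from typing import Dict


def step_rewards(
    speeds: Dict[str, float],
    prev_ranks: Dict[str, int],
    ranks: Dict[str, int],
    speed_scale: float = 0.01,   # hypothetical weight for the speed reward
    pass_penalty: float = 0.5,   # hypothetical penalty per position lost
) -> Dict[str, float]:
    """Per-step reward: a small bonus for forward speed, minus a penalty
    for every position lost since the previous step (rank 1 = leader)."""
    rewards = {}
    for agent, speed in speeds.items():
        reward = speed_scale * max(speed, 0.0)  # no bonus for going backwards
        positions_lost = ranks[agent] - prev_ranks[agent]
        if positions_lost > 0:  # the agent was passed by at least one other agent
            reward -= pass_penalty * positions_lost
        rewards[agent] = reward
    return rewards


# Example: "b" overtakes "a" during this step.
rewards = step_rewards(
    speeds={"a": 10.0, "b": 20.0},
    prev_ranks={"a": 1, "b": 2},
    ranks={"a": 2, "b": 1},
)
```

In this sketch the overtaking agent gets no extra bonus beyond its speed reward; only the agent that was passed is penalized.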
I’m still interested in how one would set up self-play for asymmetric games though.
In the asymmetric case, the competing agents can have different observations and/or actions. Typically, the problem has the same reward structure as a symmetric game, i.e. +1 for winning and -1 for losing. Examples of this type of game are hide and seek and our StrikersVsGoalie environment.
In your game, are the 5 agents exactly the same? If they are, they can share a policy and you can definitely use symmetric self-play. If they aren’t, you can treat it as an asymmetric game and also use self-play!
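On the question of differentiating by finishing position: one possible way to extend the +1/-1 convention to more than two competitors is to space the final rewards evenly between +1 and -1. This is only a sketch under that assumption, not an ML-Agents convention, and `placement_rewards` is a made-up helper name.

```python
from typing import List


def placement_rewards(num_agents: int) -> List[float]:
    """Final rewards by finishing position, evenly spaced from +1 (first)
    to -1 (last). With two agents this reduces to win: +1, lose: -1."""
    if num_agents < 1:
        raise ValueError("num_agents must be at least 1")
    if num_agents == 1:
        return [0.0]  # a single agent has nobody to beat
    return [1.0 - 2.0 * place / (num_agents - 1) for place in range(num_agents)]


five = placement_rewards(5)  # index 0 = first place, index 4 = last place
two = placement_rewards(2)
```

The rewards sum to zero across all finishers, so the game stays zero-sum in the same way as the two-player win/lose case.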
So I thought I’d share this, since it turned out to be a fun project… Didn’t use self-play here after all, but training a single policy with PPO and just rewarding for speed and penalizing being passed by other agents resulted in some pretty cool racing behaviour nonetheless.
https://github.com/mbaske/ml-hover-bike-race
Awesome, looks great!