ML-Agents: multi-policy self-play

m4l4 · July 29, 2020, 1:16pm

Hi everyone,
With self-play it is possible to train multiple agents in a competitive environment, but they all share the same goal, the same perception of the world, etc. Basically, they share the same policy during episodes.

In the context of herbivores vs. carnivores, herbivores have to learn to find plants and avoid predators, while carnivores have to learn how to catch herbivores. They need different perceptions of the surrounding environment, and each group needs its own policy. The reward is lifetime-dependent: the older an agent gets, the higher its score.
When an agent dies, I call AddReward(-1f) and EndEpisode(), and it respawns, starting a new episode in the ALREADY running env (no env reset, just the dead agent)!

I made a simple env named EnvSym, gave the agents different behavior names (Carnivore and Herbivore), made a config file named EnvSym.yaml, and launched training.

Unity does connect with 2 brains, and the names are correct (Carnivore, Herbivore), but their parameters look like the defaults, not my config. This already happened to me once, because the name of the config file didn’t match the name of the behavior.
I tried making 2 config files named after the behaviors, but I don’t know if there’s a command to pass 2 different files at once:
mlagents-learn config/ppo/??? --run-id=EnvSym01

Is there a way to put different configs in the same file? Should they be separated?

Is it correct to split carnivores and herbivores into 2 different teams? They are not really a “team”; cooperation can be useful, but the goal is still “survive as long as you can on your own”.

And the most important question: is my project even possible right now with ML-Agents?

celion_unity · July 29, 2020, 5:46pm

I talked to our resident self-play expert; this should work with self-play. Just be aware that the ELO rating may not be a useful metric in your scenario, since it was designed for zero-sum games, and it doesn’t sound like yours is.

For the config files, have a look at the StrikersVsGoalie config here: ml-agents/config/ppo/StrikersVsGoalie.yaml at release_4 · Unity-Technologies/ml-agents · GitHub
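As a rough sketch of what that layout could look like for this case: the fragment below assumes the release_4 config format (a top-level `behaviors:` key with one entry per behavior name) and uses the behavior names Carnivore and Herbivore from the question. Every hyperparameter value is an untuned placeholder for illustration, not a copy of StrikersVsGoalie.yaml and not a recommendation.

```yaml
# EnvSym.yaml (sketch): one file, two behaviors, each with its own policy.
# The keys under "behaviors" must match the Behavior Name set on each
# agent's Behavior Parameters component; otherwise defaults are used.
# All numeric values below are placeholders, not tuned settings.
behaviors:
  Carnivore:
    trainer_type: ppo
    hyperparameters:
      batch_size: 1024
      buffer_size: 10240
      learning_rate: 0.0003
      learning_rate_schedule: constant
    network_settings:
      normalize: false
      hidden_units: 256
      num_layers: 2
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
    max_steps: 5000000
    time_horizon: 1000
    summary_freq: 10000
    self_play:
      save_steps: 50000
      team_change: 200000
      swap_steps: 1000
      window: 10
      play_against_latest_model_ratio: 0.5
      initial_elo: 1200.0
  Herbivore:
    trainer_type: ppo
    hyperparameters:
      batch_size: 1024
      buffer_size: 10240
      learning_rate: 0.0003
      learning_rate_schedule: constant
    network_settings:
      normalize: false
      hidden_units: 256
      num_layers: 2
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
    max_steps: 5000000
    time_horizon: 1000
    summary_freq: 10000
    self_play:
      save_steps: 50000
      team_change: 200000
      swap_steps: 1000
      window: 10
      play_against_latest_model_ratio: 0.5
      initial_elo: 1200.0
```

With a file like this, a single path is passed to the trainer, for example `mlagents-learn config/ppo/EnvSym.yaml --run-id=EnvSym01` (the path is assumed from the file name in the question). The team assignment itself is not part of this file; the Team Id is set on each agent's Behavior Parameters component in the Unity editor.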

m4l4 · July 30, 2020, 12:32am

Yes, I found by trial and error that I can put multiple behaviors in the same config file, and the training starts just fine.

Thanks for the answer, I’ll put self-play back on.

If I get something good I’ll let you know.
