Hi everyone,
With self-play it is possible to train multiple agents in a competitive environment, but they all share the same goal, the same perception of the world, and so on. Basically, they share the same policy during episodes.
In the context of herbivores vs. carnivores, herbivores have to learn to find plants and avoid predators, while carnivores have to learn how to catch herbivores. They need different perceptions of the surrounding environment, and each group needs its own policy. Reward is lifetime-dependent: the longer an agent survives, the higher its score.
When an agent dies, I call AddReward(-1f) and then EndEpisode(), and it respawns, starting a new episode in the ALREADY running env (no env reset, just the dead agent)!
I made a simple env named EnvSym, gave the agents different behavior names (Carnivore and Herbivore), made a config file named EnvSym.yaml, and launched training.
Unity does connect with 2 brains, and the names are correct (Carnivore, Herbivore), but their parameters seem to be the defaults, not my config. This already happened to me once because the name of the config file didn’t match the name of the behavior.
I tried making 2 config files named after the behaviors, but I don’t know if there’s a command to load 2 different files at once.
mlagents-learn config/ppo/??? --run-id=EnvSym01
Is there a way to define different configs in the same file? Should they be separated?
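To make the question concrete, this is roughly what I imagine a single file would look like. It is only a sketch: it assumes a recent ML-Agents release where the trainer config has a top-level `behaviors:` key and each entry is keyed by the Behavior Name set in Unity. All the hyperparameter values are placeholders I picked for illustration, not tuned settings.

```yaml
# Sketch of one trainer config holding both behaviors.
# Assumption: entries under `behaviors:` must match the Behavior Names
# set in Unity (Carnivore, Herbivore). Values below are placeholders.
behaviors:
  Carnivore:
    trainer_type: ppo
    hyperparameters:
      batch_size: 1024
      buffer_size: 10240
      learning_rate: 3.0e-4
    network_settings:
      hidden_units: 128
      num_layers: 2
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
    max_steps: 5000000
    time_horizon: 64
    summary_freq: 10000
  Herbivore:
    trainer_type: ppo
    hyperparameters:
      batch_size: 1024
      buffer_size: 10240
      learning_rate: 3.0e-4
    network_settings:
      hidden_units: 256
      num_layers: 2
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
    max_steps: 5000000
    time_horizon: 64
    summary_freq: 10000
```

If that layout is right, I would expect a behavior whose name is missing from the file (or misspelled) to fall back to default settings, which might be exactly what I am seeing.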
Is it correct to split carnivores and herbivores into 2 different teams? They are not really a “team”: cooperation can be useful, but the goal is still “survive as long as you can on your own”.
And the most important question: is my project even possible right now with ML-Agents?