Advice on ML Agents Hide and Seek / Tag

Hello all!

For my first project in Unity ML Agents, I’m planning on implementing a hide and seek / tag type of game where the player needs to either hide from or find the agent in an obstacle course featuring moving doors, ramps, walls, etc. I’d like some advice on how the agent should be rewarded or penalized.

My current plan is to give the seeker agent raycasts in every direction that it can use to see the player. For each second the player is hit by one of the agent’s raycasts, the agent gets a small reward. The agent will also get a per-second reward based on how close it is to the player (smaller distance between player and agent = higher reward). Making contact with the player gives the agent a large reward and ends the episode.
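
To make the seeker plan concrete, here is a rough sketch of the per-step reward as I picture it. It is plain, engine-agnostic Python rather than actual ML-Agents code, and every name and constant in it (`seeker_step_reward`, `MAX_DISTANCE`, the reward sizes) is a placeholder I made up for illustration, not something I have tuned or tested in training.

```python
import math

# Hypothetical constants; real values would need tuning.
SEE_REWARD_PER_SEC = 0.01        # reward rate while the player is visible
PROXIMITY_REWARD_PER_SEC = 0.01  # maximum reward rate for being close
CATCH_REWARD = 1.0               # one-off reward for touching the player
MAX_DISTANCE = 50.0              # assumed largest distance in the arena


def seeker_step_reward(agent_pos, player_pos, player_visible, touched_player, dt):
    """Return (reward, episode_done) for one step lasting dt seconds."""
    if touched_player:
        # Contact gives the large reward and ends the episode.
        return CATCH_REWARD, True

    reward = 0.0
    if player_visible:
        # Small reward for each second the player is hit by a raycast.
        reward += SEE_REWARD_PER_SEC * dt

    # Closeness is 1.0 at zero distance and 0.0 at MAX_DISTANCE or beyond.
    distance = math.dist(agent_pos, player_pos)
    closeness = 1.0 - min(distance, MAX_DISTANCE) / MAX_DISTANCE
    reward += PROXIMITY_REWARD_PER_SEC * closeness * dt
    return reward, False
```

Multiplying by `dt` is meant to keep the per-second totals the same regardless of how often the step runs.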

The hiding agent will be similar. It will use raycasts to see the player, but in its case it will get a small reward for each second the player is NOT hit by any of its raycasts. It will also get a per-second reward based on how far it is from the player (greater distance between player and agent = higher reward). Making contact with the player gives the agent a large penalty and ends the episode.
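
The hider's reward would be roughly the mirror image. Again, this is only an engine-agnostic Python sketch of the idea, not ML-Agents code, and the names and constants (`hider_step_reward`, `MAX_DISTANCE`, the reward sizes) are placeholders I invented for illustration.

```python
import math

# Hypothetical constants; real values would need tuning.
HIDDEN_REWARD_PER_SEC = 0.01    # reward rate while the player is out of sight
DISTANCE_REWARD_PER_SEC = 0.01  # maximum reward rate for being far away
CAUGHT_PENALTY = -1.0           # one-off penalty for touching the player
MAX_DISTANCE = 50.0             # assumed largest distance in the arena


def hider_step_reward(agent_pos, player_pos, player_visible, touched_player, dt):
    """Return (reward, episode_done) for one step lasting dt seconds."""
    if touched_player:
        # Contact gives the large penalty and ends the episode.
        return CAUGHT_PENALTY, True

    reward = 0.0
    if not player_visible:
        # Small reward for each second no raycast hits the player.
        reward += HIDDEN_REWARD_PER_SEC * dt

    # Farness is 0.0 at zero distance and capped at 1.0 from MAX_DISTANCE on.
    distance = math.dist(agent_pos, player_pos)
    farness = min(distance, MAX_DISTANCE) / MAX_DISTANCE
    reward += DISTANCE_REWARD_PER_SEC * farness * dt
    return reward, False
```

Capping the distance term at `MAX_DISTANCE` is my own assumption, so that the reward stays bounded however large the course is.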

I’m curious what issues I can expect to run into, and whether there’s a better way to implement the agent’s reward system. I also have a question: will the agent learn on its own how to navigate around obstacles, or will I need to implement a NavMesh or some other form of pathfinding for the agent?

I appreciate any and all advice you have to share. Thank you!