Tips for creating a patrolling/wandering agent?

I’m trying to create a very basic Among Us-type game. I have a bunch of buttons that can be turned on or off - I haven’t started any social deduction training yet, I just want to get my agents moving around properly.

Ideally, I want them to move around the map checking button spots to see if any are turned off, so they can turn them back on. I’d like this movement to be semi-random, as if it were a real player. However, I’m struggling with the training. Due to the complexity of the level (there are 5 buttons and multiple walls), basic reinforcement learning took far too long without any significant results. I have since tried imitation learning, but ran into issues. If I demo a specific path, it works, but the agent always seems to follow that exact path, so it becomes too predictable for my liking. If I demo multiple paths, the agent again takes a very long time to train and doesn’t really show good results (it seems to just run into walls).

Any suggestions on how I might handle this? My reward system is set up to provide +0.1 for every button turned on, and +1 when they are all turned on, with a constant small negative reward per step to encourage faster movement and prevent stalemates.
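For reference, that reward scheme can be sketched as a tiny Python function (the per-step penalty value of -0.001 is an assumed placeholder, since the post doesn’t give the exact number):

```python
STEP_PENALTY = -0.001  # assumed value: small constant negative reward each step

def step_reward(buttons_pressed_this_step, all_buttons_on):
    """Reward for one environment step, mirroring the scheme described above."""
    reward = STEP_PENALTY
    reward += 0.1 * buttons_pressed_this_step  # +0.1 per button turned on
    if all_buttons_on:
        reward += 1.0  # episode-completion bonus when every button is on
    return reward
```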

Thanks for any help.

Have you tried a curriculum? Perhaps start simple, training with small rooms and few buttons. Then gradually increase environment complexity.


Yeah, as mentioned above: start smaller. Maybe first teach the agent to reach a button, then to reach and press it, then add another button. Maybe one that is on and one that is off, and so forth.

You can run the process manually or through a curriculum config.
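If you’re training with Unity ML-Agents, the config route looks roughly like this trainer-config snippet. This is only a sketch: `ButtonAgent` and `num_buttons` are placeholder names for your own behavior and environment parameter, and the thresholds are made up.

```yaml
# Hypothetical curriculum fragment for an ML-Agents trainer config.
environment_parameters:
  num_buttons:
    curriculum:
      - name: OneButton
        completion_criteria:
          measure: reward
          behavior: ButtonAgent
          threshold: 0.8
          min_lesson_length: 100
        value: 1.0
      - name: ThreeButtons
        completion_criteria:
          measure: reward
          behavior: ButtonAgent
          threshold: 0.8
          min_lesson_length: 100
        value: 3.0
      - name: FiveButtons  # final lesson: full difficulty
        value: 5.0
```

On the environment side you would then read the current lesson’s value at episode reset (in ML-Agents that’s `Academy.Instance.EnvironmentParameters.GetWithDefault("num_buttons", 1f)`) and spawn that many buttons.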

Thanks for the input - I will definitely give curriculum training a try. Before I dive into the deep end, is there a way I can avoid training specific paths? I eventually want the agent to behave as if it’s wandering from button to button, checking for un-pressed buttons rather than following one path. That way it doesn’t look too robotic when there are multiple agents.

I envision a platform that is freely walkable, with a random spawn point for the agent and a random spawn point for the button. This randomness will teach the agent not to follow any specific path but to search for the button and go to it.

Once this is learned, try spawning randomly placed walls that the agent has to walk around to see the buttons.

After that, I think it’s just about randomly spawning more buttons and walls.
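The randomized-reset idea above can be sketched like this. It’s a hypothetical Python helper, not your engine code: in Unity you would do the same sampling wherever you reset the episode (e.g. `OnEpisodeBegin` if you’re on ML-Agents). The `min_dist` constraint just stops a button spawning on top of the agent.

```python
import random

def random_spawns(width, height, n_buttons, min_dist=2.0, rng=random):
    """Pick a random agent position and n_buttons button positions on a
    width x height platform. Buttons are kept at least min_dist away from
    the agent so no button is already "reached" at spawn. Returns (x, z)
    floor coordinates: (agent_pos, [button_pos, ...])."""
    agent = (rng.uniform(0, width), rng.uniform(0, height))
    buttons = []
    while len(buttons) < n_buttons:
        p = (rng.uniform(0, width), rng.uniform(0, height))
        # Rejection-sample until the button is far enough from the agent.
        if (p[0] - agent[0]) ** 2 + (p[1] - agent[1]) ** 2 >= min_dist ** 2:
            buttons.append(p)
    return agent, buttons
```

Because every episode sees a fresh layout, the agent can’t memorize one route; it has to learn a general search-and-approach behavior instead.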

Maybe look into CodeMonkey’s video with the example of an agent pushing a button to get food. You won’t need the food, but the reward will be the same either way.
You just have to increase the complexity of that example by adding walls and buttons.