The angry bots are back with a mix of heuristic and machine-learned behaviour. Hardcoded high-level logic switches between patrol and attack modes based on opponent proximity, and provides direction vectors and target speed values to a deep learning policy, which in turn controls the robot’s actuators while reacting to the environment physics.
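In case it helps, here’s roughly what that high-level layer boils down to (a minimal Python sketch, not the actual game code; the radius, the speeds, and the `walker_policy` interface are all made-up stand-ins):

```python
import numpy as np

ATTACK_RADIUS = 10.0  # hypothetical proximity threshold for mode switching

def high_level_command(bot_pos, opponent_pos, patrol_waypoint):
    """Heuristic layer: picks a mode and emits a direction vector + target speed."""
    to_opponent = opponent_pos - bot_pos
    if np.linalg.norm(to_opponent) < ATTACK_RADIUS:
        # Attack mode: head straight for the opponent at full speed.
        direction = to_opponent / np.linalg.norm(to_opponent)
        target_speed = 1.0
    else:
        # Patrol mode: amble toward the current waypoint.
        to_waypoint = patrol_waypoint - bot_pos
        direction = to_waypoint / (np.linalg.norm(to_waypoint) + 1e-8)
        target_speed = 0.5
    return direction, target_speed

# The learned low-level policy then consumes these commands plus
# proprioception and outputs the actuator targets, conceptually:
#   action = walker_policy(body_state, direction, target_speed)
```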
I took some inspiration from the recent “CARL: Controllable Agent with Reinforcement Learning for Quadruped Locomotion” paper. My agent was trained with an initial round of imitation learning to mimic a quadruped gait, followed by a reinforcement learning phase to make the policy (user-)controllable and robust against perturbations.
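For anyone curious, the two-phase schedule looks conceptually like this (a rough PyTorch sketch with made-up network sizes and stand-in data, not my actual training code; phase 2 is only described in comments since the exact reward and trainer depend on the setup):

```python
import torch
import torch.nn as nn

# Hypothetical policy: observations -> joint targets (sizes are made up).
policy = nn.Sequential(nn.Linear(48, 256), nn.ReLU(),
                       nn.Linear(256, 256), nn.ReLU(),
                       nn.Linear(256, 12))
opt = torch.optim.Adam(policy.parameters(), lr=3e-4)

# Phase 1: imitation learning (behaviour cloning) on reference gait data.
# Random tensors stand in for (observation, reference action) pairs
# recorded from a quadruped gait.
obs = torch.randn(1024, 48)
ref_actions = torch.randn(1024, 12)
for epoch in range(50):
    loss = nn.functional.mse_loss(policy(obs), ref_actions)
    opt.zero_grad()
    loss.backward()
    opt.step()

# Phase 2 (not shown): RL fine-tuning, e.g. with PPO, where randomized
# direction/speed commands are part of the observation and random external
# pushes are applied during rollouts. The reward favours tracking the
# command while keeping the gait stable, which is what makes the policy
# controllable and robust to perturbations.
```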
That would be awesome… but I’m afraid it’s just some FX (post-processing layer with edge detection). I didn’t use any visual observations for this one.
Yeah, that old version uses depth and motion-vector textures as visual input. That environment is much simpler to handle though, no buildings etc. I haven’t tried visual training with the new one. Either way, those would be different policies, layered on top of one another: a visually trained navigator policy would generate direction vectors (avoiding obstacles, following other bots) which are then fed to the walker policy - at least that’s how I did it back then. The current version has some hardcoded logic for calculating the direction vectors.
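To make the layering concrete, the composition is something like this (pure stubs for illustration; none of these bodies reflect the real policies, just the interface between the two layers):

```python
import numpy as np

def navigator_policy(depth_map, motion_vectors):
    """Visually trained layer: returns a direction vector that avoids
    obstacles or follows other bots. (Stub for illustration.)"""
    return np.array([1.0, 0.0])  # e.g. "keep heading forward"

def walker_policy(body_state, direction, target_speed):
    """Low-level locomotion layer: maps command + body state to joint targets."""
    return np.zeros(12)  # stub actuator output

def step(depth_map, motion_vectors, body_state):
    # The navigator's output is simply the walker's command input.
    direction = navigator_policy(depth_map, motion_vectors)
    return walker_policy(body_state, direction, target_speed=1.0)
```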
Thx for the insight and the details.
Oh well, that would indeed be cool, having full vision mechanics.
But for game applications, sure, you should simplify and use smoke and mirrors whenever possible.