Q-Learning implementation

Is there a way to implement a Q-Learning algorithm in ML-Agents?

We don't currently have a DQN implementation in ML-Agents, though we do support Soft Actor-Critic, which, like DQN, is an off-policy algorithm that learns a Q-function.

You can plug in your own algorithms using the Python API (https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Python-API.md) or the Gym wrapper (https://github.com/Unity-Technologies/ml-agents/blob/master/gym-unity/README.md).
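For example, the Gym-wrapper route looks roughly like this. This is only a sketch: the build path is a placeholder, and the exact import names have moved between ML-Agents releases, so check the docs for the version you have installed.

```python
# Minimal sketch: drive a Unity build through the Gym wrapper and step it
# with random actions (stand in your own Q-learning update where noted).
from mlagents_envs.environment import UnityEnvironment
from gym_unity.envs import UnityToGymWrapper

# "path/to/UnityBuild" is a placeholder for your exported environment
unity_env = UnityEnvironment("path/to/UnityBuild", no_graphics=True)
env = UnityToGymWrapper(unity_env)

obs = env.reset()
for _ in range(1000):
    action = env.action_space.sample()  # replace with your Q-learning policy
    obs, reward, done, info = env.step(action)
    if done:
        obs = env.reset()
env.close()
```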


We don't currently have an interface for writing a custom trainer using mlagents-learn.

If you want to use the low-level Python API to interact with the environment, that's possible. There is an example of this in a Google Colab here: https://colab.research.google.com/drive/1nkOztXzU91MHEbuQ1T9GnynYdL_LRsHG#forceEdit=true&sandboxMode=true
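For reference, the interaction loop looks roughly like the sketch below. The build path is a placeholder, and calls like `ActionTuple` / `random_action` have shifted between ML-Agents releases, so treat it as a template rather than something to copy verbatim.

```python
# Rough sketch of the low-level mlagents_envs interaction loop: reset the
# environment, read decision steps, set actions, and advance the simulation.
from mlagents_envs.environment import UnityEnvironment

# Pass a build path, or None to connect to a scene running in the Editor
env = UnityEnvironment("path/to/UnityBuild")
env.reset()

behavior_name = list(env.behavior_specs)[0]
spec = env.behavior_specs[behavior_name]

for _ in range(100):
    decision_steps, terminal_steps = env.get_steps(behavior_name)
    # Random actions as a stand-in for a learned Q-function
    action = spec.action_spec.random_action(len(decision_steps))
    env.set_actions(behavior_name, action)
    env.step()
env.close()
```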


@celion_unity @ervteng_unity

Massive necro, but... ARE there any plans to document an interface for custom trainers? Unless I'm missing something crucial, there's nothing stopping us from mimicking the structure of the current implementations, but an official way would obviously be preferable.

@lgendrot Custom trainers are something that we'd like to do in the future, but there's still no definite timeline. We took the first steps towards a plugin system here - we would take a similar approach for registration of custom trainers, but I think we'd need to do more work making our training code more modular and reusable.


Ah cool, thanks for drawing my attention to that PR

Is it just me, or are the Gym wrapper examples deprecated? baselines doesn't work with TF 2.0, so the docs here should be updated to use stable-baselines3, which is itself giving me errors, and I can't get help on my thread at the moment. @celion_unity
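For context, the setup I'm attempting looks roughly like this. The build path is a placeholder, and it assumes a single-agent environment with a discrete action space; wrapper import names and stable-baselines3's Gym version requirements depend on the releases installed.

```python
# Sketch: train stable-baselines3's DQN on a Unity build via the Gym wrapper.
from mlagents_envs.environment import UnityEnvironment
from gym_unity.envs import UnityToGymWrapper
from stable_baselines3 import DQN

# "path/to/UnityBuild" is a placeholder for the exported environment
unity_env = UnityEnvironment("path/to/UnityBuild", no_graphics=True)
env = UnityToGymWrapper(unity_env)

model = DQN("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=10_000)
model.save("unity_dqn")
env.close()
```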