Good practices for organizing results and trained models

Hi,
I'm using mlagents in my game and I have trained several agents with different observations, hyperparameters, game settings, etc. With so many trained models I'm getting confused, and it is hard to compare results from each variant.
I would like to know which methodologies, naming conventions, and good practices you use to organize your training workflow.

This tool might help
https://github.com/mbaske/command-config
It also has a “Notes” field for each training configuration.
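
If you would rather keep things by hand, one simple option is to encode whatever differs between runs in the run ID itself. Below is a minimal sketch of that idea in Python. The function name `make_run_id` and the naming scheme (date, behavior name, variant label, varied hyperparameters) are just my own assumptions, not something from mlagents or from the tool above, so adapt them to your setup.

```python
import datetime as dt
import re


def make_run_id(behavior, variant, hyperparams, day):
    """Build a sortable, descriptive run ID (hypothetical naming scheme)."""
    # Encode only the hyperparameters being varied, sorted by key so the
    # same settings always produce the same name.
    parts = [f"{key}{value}" for key, value in sorted(hyperparams.items())]
    raw = "_".join([day.strftime("%Y%m%d"), behavior, variant, *parts])
    # Replace anything that could be awkward in a folder name.
    return re.sub(r"[^A-Za-z0-9_.-]", "-", raw)


# Example values are made up for illustration.
run_id = make_run_id(
    "Walker", "obs-v2", {"lr": 3e-4, "batch": 1024}, dt.date(2024, 1, 15)
)
print(run_id)
```

The resulting string can be passed to `mlagents-learn` through its `--run-id` option, so the results folder name already tells you which variant it holds. Putting the date first keeps the folders in chronological order when sorted by name.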