Hello,
I’ve been training an agent on a physics-related task in Unity using mlagents-learn, and since the task I’m interested in has a sparse-reward setup, I tried using curiosity.
What I observed during training with curiosity is that the Curiosity Value Estimate keeps increasing (it goes from 0 at the beginning of training to about 4e4 at roughly 1M steps), while the Curiosity Reward slowly drops to 0 and the agent doesn’t seem to explore the environment.
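For reference, this is the shape of the curiosity section in my trainer config (the behavior name `MyBehavior` and the specific values here are placeholders, not my exact settings):

```yaml
behaviors:
  MyBehavior:
    trainer_type: ppo
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
      curiosity:
        gamma: 0.99
        strength: 0.02        # scales the intrinsic reward relative to the extrinsic one
        network_settings:
          hidden_units: 256   # size of the curiosity module's encoder
        learning_rate: 3.0e-4 # learning rate of the curiosity (ICM) module
```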
Any suggestions about what I’m doing wrong?