Hi,
I have a training job running right now that still seems to be making progress, but the entropy graph has gone into negative numbers. The documentation says:
> Policy/Entropy (PPO; SAC) - How random the decisions of the model are. Should slowly decrease during a successful training process. If it decreases too quickly, the beta hyperparameter should be increased.
So I am having a hard time understanding what negative entropy means. Can someone please clarify?
Thank you
Dan
I see this was previously asked here: negative entropy? · Issue #1019 · Unity-Technologies/ml-agents · GitHub
> Hi, It is possible for the Entropy to be negative in the case of Continuous Control. [See article.](https://en.wikipedia.org/wiki/Differential_entropy) That is because of the way it is defined. In the case of Discrete Control, the Entropy is always positive.
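To make that concrete for myself, here is a minimal sketch (plain Python, not ML-Agents code) of the differential entropy of a 1-D Gaussian, which is the distribution used for continuous actions if I understand correctly. The value 0.5 * ln(2πeσ²) drops below zero once the standard deviation σ falls under roughly 0.24, so a negative reading just seems to mean the policy has become fairly deterministic:

```python
import math

def gaussian_entropy(sigma: float) -> float:
    """Differential entropy of a 1-D Gaussian with standard deviation sigma."""
    return 0.5 * math.log(2 * math.pi * math.e * sigma ** 2)

# Entropy is positive for wide distributions and negative for narrow ones.
for sigma in (1.0, 0.5, 0.24, 0.1):
    print(f"sigma={sigma:>4}: entropy={gaussian_entropy(sigma):+.3f}")
# sigma= 1.0: entropy=+1.419
# sigma= 0.5: entropy=+0.726
# sigma=0.24: entropy=-0.008
# sigma= 0.1: entropy=-0.884
```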
The linked article doesn’t really explain (to me at least) what the ramifications of the entropy being negative are. Can someone explain in simpler terms what it means?
Thank you
Dan