I am currently training an agent to explore a platforming level. As I understand it, during training the entropy value is used as a measure of uncertainty to recognize when surprising events occur and thus reward curiosity. Is there a way to get a similar measure during inference?
In case the reason for wanting this is relevant: I want to use it to notify me when something happens that the agent didn't expect (e.g. falling through the floor, which I assume would be surprising to a trained agent, since the floor is supposed to be safe).
Hi,
I think you might be conflating entropy (which is a measure of how randomly the agent acts) with curiosity (which is used to reward the agent for being “surprised”). I won’t claim to be an expert in entropy, but our documentation links to this blog post for an explanation of how it relates to RL training.
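To make the distinction concrete, here’s a minimal sketch in Python (the function names are made up, and it assumes a discrete action space and a learned forward model; this isn’t ML-Agents code): entropy comes straight from the policy’s action probabilities, while a curiosity-style signal is the error of a model that tries to predict the next observation.

```python
import numpy as np

def policy_entropy(action_probs):
    """Entropy of a discrete action distribution.
    High when the policy is close to uniform (acting randomly),
    low when it strongly prefers one action."""
    p = np.clip(np.asarray(action_probs, dtype=np.float64), 1e-12, 1.0)
    return float(-np.sum(p * np.log(p)))

def curiosity_signal(predicted_next_obs, actual_next_obs):
    """Curiosity-style 'surprise': the error of a learned forward model
    that predicts the next observation. A large error means the
    transition was unexpected, which is what the curiosity reward
    is built from."""
    diff = np.asarray(predicted_next_obs) - np.asarray(actual_next_obs)
    return float(np.mean(diff ** 2))

# A near-uniform policy has high entropy; a confident one has low entropy.
print(policy_entropy([0.25, 0.25, 0.25, 0.25]))  # ~1.39
print(policy_entropy([0.97, 0.01, 0.01, 0.01]))  # ~0.17
```

So entropy tells you about the *policy* (how undecided it is), whereas the curiosity signal tells you about the *environment transition* (how badly it was predicted); the latter is closer to what you’re describing with the floor example.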
We don’t currently have any way to evaluate either entropy or curiosity at inference time; I’ll log it as a feature request, but I have no idea if or when we’ll be able to add it.
In the meantime, if you’re looking to apply this to QA for games, a related topic you might want to search for is “outlier detection”; here’s a master’s thesis that was one of the first hits for “outlier detection in games”.
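To give a rough idea of the simplest version of that, here’s a toy sketch (again, not ML-Agents code; the class name, the per-feature z-score approach, and the threshold are all just illustrative): fit some statistics on observations logged during normal play, then flag observations that fall far outside them, like a y-position well below the floor.

```python
import numpy as np

class ObservationOutlierDetector:
    """Very simple outlier detector: fit per-feature mean/std on
    observations from normal play, then flag observations whose
    z-score is unusually large (e.g. the agent's y-position after
    it has fallen through the floor)."""

    def __init__(self, threshold=6.0):
        self.threshold = threshold
        self.mean = None
        self.std = None

    def fit(self, observations):
        obs = np.asarray(observations, dtype=np.float64)
        self.mean = obs.mean(axis=0)
        self.std = obs.std(axis=0) + 1e-8  # avoid division by zero

    def score(self, observation):
        # Maximum absolute z-score across the observation's features.
        z = np.abs((np.asarray(observation) - self.mean) / self.std)
        return float(z.max())

    def is_outlier(self, observation):
        return self.score(observation) > self.threshold

# Usage: fit on observations from normal episodes, then check each
# observation while the trained agent plays.
normal_obs = np.random.normal(loc=[0.0, 1.0, 0.0], scale=[2.0, 0.5, 2.0], size=(1000, 3))
detector = ObservationOutlierDetector(threshold=6.0)
detector.fit(normal_obs)
print(detector.is_outlier([0.5, 1.2, -1.0]))    # False: looks like normal play
print(detector.is_outlier([0.5, -50.0, -1.0]))  # True: y-position far below the floor
```

Real approaches (including the ones in that thesis) are more sophisticated, but the idea is the same: you don’t need anything from the trained policy to notice that the game state itself has gone somewhere it never goes during normal play.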