I have been calling “EndEpisode” at the end of “OnActionReceived”, which is what most of the examples do. However, this generates an extra call to “CollectObservations”, which is reached through this call chain:
- OnActionReceived
- EndEpisode
- EndEpisodeAndReset
- NotifyAgentDone
- …
- CollectObservations
This produces a stack trace in which CollectObservations is called from inside OnActionReceived. I don’t like this extra CollectObservations call because it happens after my agent has applied this step’s actions to the environment, so the observations it collects are no longer the ones that produced those actions.
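To make the ordering concrete, here is a toy Python model of the behaviour described above. It is not ML-Agents code: `ToyAgent`, `run_steps`, the `limit` threshold and the integer `position` state are all made up for illustration, and the reset that follows the final observation is my assumption about what happens after the elided part of the chain.

```python
class ToyAgent:
    """Toy stand-in for an agent; it only mimics the callback order."""

    def __init__(self, limit=3):
        self.limit = limit   # hypothetical termination threshold
        self.position = 0    # the entire environment state
        self.trace = []      # ordered record of (callback, state)

    def collect_observations(self):
        self.trace.append(("CollectObservations", self.position))
        return self.position

    def on_episode_begin(self):
        # Reset to the start conditions.
        self.position = 0
        self.trace.append(("OnEpisodeBegin", self.position))

    def end_episode(self):
        # Mirrors the chain above: ending the episode gathers one more
        # observation. The reset afterwards is an assumption.
        self.trace.append(("EndEpisode", self.position))
        self.collect_observations()
        self.on_episode_begin()

    def on_action_received(self, action):
        # Apply the action first, then check for termination.
        self.position += action
        self.trace.append(("OnActionReceived", self.position))
        if self.position >= self.limit:
            self.end_episode()


def run_steps(agent, actions):
    """Observe, then act, once per action; return the callback trace."""
    for action in actions:
        agent.collect_observations()
        agent.on_action_received(action)
    return agent.trace


if __name__ == "__main__":
    for entry in run_steps(ToyAgent(limit=3), [1, 2]):
        print(entry)
```

In the second step the action was chosen from the observation taken at position 1, but the extra CollectObservations call inside EndEpisode sees position 3, which is the mismatch I am describing.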
So when is the best time to check if the Episode should terminate and call EndEpisode?
I have considered using Academy.AgentPreStep, but that doesn’t feel right either, because I would end up with a code flow like this:
- OnActionReceived
- EndEpisode
- …
- CollectObservations
- …
- OnEpisodeBegin
- Reset or generate new start conditions
- CollectObservations
- OnActionReceived
Is there a recommended best practice for when to call EndEpisode?