Hi all,
I’m noticing that at runtime, information for some Agent IDs is passed to both the DecisionStep and the TerminalStep.
To be more specific, in Python I can query DecisionStep.agent_id and TerminalStep.agent_id, and in some cases I find that the same Agent IDs appear in both lists.
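A minimal sketch of how the overlap can be checked, assuming the two agent_id arrays have already been read from the step objects (for example from the pair returned by `env.get_steps(behavior_name)` in `mlagents_envs`). The helper name `overlapping_agent_ids` and the sample ID values are hypothetical and only stand in for real data:

```python
def overlapping_agent_ids(decision_ids, terminal_ids):
    """Return the sorted agent IDs that appear in both sequences."""
    # Cast to int so NumPy integer arrays and plain lists behave the same.
    decision_set = {int(i) for i in decision_ids}
    terminal_set = {int(i) for i in terminal_ids}
    return sorted(decision_set & terminal_set)


# Hypothetical values standing in for DecisionStep.agent_id and
# TerminalStep.agent_id in a 12-agent scene.
decision_ids = list(range(12))
terminal_ids = [1, 7, 9]

overlap = overlapping_agent_ids(decision_ids, terminal_ids)
print(overlap)  # → [1, 7, 9]
```

An empty list means that no agent was reported in both objects for that step.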
Which is the correct data point for the duplicated Agent ID?
Is this a possible bug?
See the image below for an example from the 3DBall environment with 12 agent IDs.
I notice that the DecisionStep shows a reward of 0 for Agents 1, 7 and 9, and that the observation vector for each of these agents differs between the two objects. I would assume that the observation in the TerminalStep is the one from the actual terminal step.
Is my assumption correct?
Thanks!