What are steps exactly?

There are actually two types of steps: one tracked on the C# side and one tracked on the Python side.

On the C# side, there are the steps that are 1:1 with a fixed update call, i.e. this step counter is incremented in the Agent every time a fixed update occurs. The ‘max step’ field in the agent script corresponds to the maximum number of these steps that an agent can exist for in an environment.

Then, on the Python side, there are environment steps, each of which corresponds to one decision interval (set in the decision requester component on an agent) elapsing. A decision interval is a set number of fixed updates (e.g. 5) that occur between an agent collecting an observation and changing its action. The ‘max_steps’ specified in the YAML config files is counted in these steps.

So, if you have max step = 3000 set in your agent script with a decision interval of 5, the maximum number of environment steps per episode is 3000 / 5 = 600. These are on different counters because Python and C# run at different cadences. Sorry, I know this is confusing. Let me know if I can clarify anything.

TL;DR:
C# Step = 1 fixed update
Python Step = 1 decision interval = k fixed updates
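
If it helps, the conversion between the two counters can be sketched in a few lines of Python. This is only an illustration of the arithmetic above, not part of the toolkit: the function name `env_steps_per_episode` and its parameters are made up for this example, and rounding down when the max step is not an exact multiple of the decision interval is my assumption.

```python
def env_steps_per_episode(agent_max_step: int, decision_interval: int) -> int:
    """Hypothetical helper: convert C#-side steps to Python-side environment steps.

    agent_max_step    -- the 'max step' value from the agent script (fixed updates)
    decision_interval -- fixed updates between decisions (decision requester)
    """
    if decision_interval <= 0:
        raise ValueError("decision_interval must be a positive integer")
    # One environment step spans `decision_interval` fixed updates.
    # Assumption: a partial interval at the end of an episode is not counted.
    return agent_max_step // decision_interval


# The example from this post: max step = 3000, decision interval = 5
print(env_steps_per_episode(3000, 5))  # 600
```

With a decision interval of 1 the two counters line up, so the same call just returns the agent's max step unchanged.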
