Questions about reward in mlagents-learn console

Does the Mean Reward logged in the console mean the sum of all rewards during one episode without discounting, or the sum with the discount factor (gamma) applied?

I stand to be corrected, but I’ve taken the Mean Reward to be the mean reward achieved over all the previous episodes. Each time EndEpisode(…) is called, the total reward for that episode becomes one of the values over which the mean is calculated.

Anyone care to chime in if I’m off the mark?
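
To make the two readings in the question concrete, here is a minimal Python sketch. It is not ML-Agents code: the function names, the reward values and the gamma of 0.5 are all made up for illustration, and it assumes the interpretation above, where each finished episode contributes one total to the mean.

```python
from statistics import mean

def undiscounted_return(rewards):
    """Plain sum of the per-step rewards in one episode."""
    return sum(rewards)

def discounted_return(rewards, gamma):
    """Sum of the per-step rewards, each weighted by gamma**t."""
    return sum(r * gamma**t for t, r in enumerate(rewards))

# Hypothetical per-step rewards for two finished episodes.
episodes = [
    [1.0, 0.0, 1.0],
    [0.0, 0.0, 2.0],
]

# Reading 1: mean of the undiscounted episode totals.
mean_undiscounted = mean(undiscounted_return(ep) for ep in episodes)

# Reading 2: mean of the discounted episode totals (gamma chosen arbitrarily).
mean_discounted = mean(discounted_return(ep, 0.5) for ep in episodes)

print(mean_undiscounted, mean_discounted)
```

With the same episodes the two readings give different numbers, so comparing the console value against rewards you have summed by hand for a simple environment is one way to check which one applies.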