Hello, what does cumulative episode reward in ML-Agents mean? Is it the mean over all episodes up to a point in training, or the mean over the last 100 episodes?
If you’re talking about the reward that’s displayed in TensorBoard, it’s neither: it’s the mean cumulative reward of the episodes that finished within each window of summary_freq steps.
summary_freq is a parameter in your YAML config.
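To make the windowing concrete, here is a rough sketch of the idea. It is not ML-Agents code: the function name, the (end_step, reward) input format, and the rule that an episode is counted in the window where it ends are all my assumptions for illustration.

```python
from collections import defaultdict


def mean_reward_per_window(episodes, summary_freq):
    """Average cumulative episode reward per summary window (illustrative only).

    episodes: iterable of (end_step, cumulative_reward) pairs.
    Returns {step_at_which_summary_is_written: mean_reward}.
    """
    buckets = defaultdict(list)
    for end_step, reward in episodes:
        # Assumption: an episode is counted in the window containing its last step.
        buckets[end_step // summary_freq].append(reward)
    return {
        (window + 1) * summary_freq: sum(rewards) / len(rewards)
        for window, rewards in sorted(buckets.items())
    }


# Hypothetical episode data: (step at which the episode ended, its total reward).
episodes = [(300, 1.0), (800, 3.0), (1200, 5.0), (1900, 7.0), (2500, 10.0)]
print(mean_reward_per_window(episodes, summary_freq=1000))
# {1000: 2.0, 2000: 6.0, 3000: 10.0}
```

Each point on the curve only reflects the episodes from its own window, so it is not a running mean over the whole training run.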
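Also note that the curriculum uses a different metric: it checks the mean reward over the last min_lesson_length episodes (the ML-Agents docs count this in completed episodes, not steps). min_lesson_length is a parameter of each individual lesson.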
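For comparison, the curriculum check can be pictured roughly like this. Again this is only a sketch under assumptions, not the actual ML-Agents implementation: I am assuming the window is counted in completed episodes and that the mean is compared against the lesson's threshold, and lesson_complete is a made-up helper name.

```python
from collections import deque


def lesson_complete(episode_rewards, min_lesson_length, threshold):
    """Return True if the mean of the last min_lesson_length rewards meets threshold."""
    # Keep only the most recent min_lesson_length episode rewards.
    recent = deque(episode_rewards, maxlen=min_lesson_length)
    if len(recent) < min_lesson_length:
        # Not enough completed episodes yet to evaluate the lesson.
        return False
    return sum(recent) / len(recent) >= threshold


# Hypothetical rewards from four completed episodes.
print(lesson_complete([0.1, 0.9, 1.0, 1.1], min_lesson_length=3, threshold=0.8))
```

So the curriculum can advance (or not) based on a different set of episodes than the ones behind the latest TensorBoard point.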