Curriculum Training not increasing lesson.

Hello,
I am trying to use curriculum training in combination with self-play. The problem is that during the whole training process, the lesson does not change. The parameter is shown in Tensorboard but does not change from the value defined in the first lesson. I did some research and it looks like in older versions it was necessary to call Academy.Reset(). Is something similar still necessary in the newer versions? I am using ML Agents Version 1.0.4 from the package manager and my version information from the console is:

Version information:
ml-agents: 0.19.0,
ml-agents-envs: 0.19.0,
Communicator API: 1.0.0,
TensorFlow: 2.3.0

I don’t use maxStep in the Agent class (I set it to 0), but I guess the threshold depends on the max_steps set in the configuration file anyway. To access the parameter I use only a single line of code; I don’t have a custom Academy class.

```csharp
float obstacleScale = Academy.Instance.EnvironmentParameters.GetWithDefault("obstacle_size", 0);
```
```yaml
environment_parameters:
  obstacle_size:
    curriculum:
      - name: Lesson0 # The '-' is important as this is a list
        completion_criteria:
          measure: progress
          behavior: CarAgentBehavior
          signal_smoothing: true
          min_lesson_length: 0
          threshold: 0.1
        value: 0.0
      - name: Lesson1 # This is the start of the second lesson
        completion_criteria:
          measure: progress
          behavior: CarAgentBehavior
          signal_smoothing: true
          min_lesson_length: 0
          threshold: 0.3
        value: 5.0
      - name: Lesson2
        completion_criteria:
          measure: progress
          behavior: CarAgentBehavior
          signal_smoothing: true
          min_lesson_length: 100
          threshold: 0.5
        value: 10.0
      - name: Lesson3
        value: 12.5
```

EDIT:
For now I am managing my lessons via C# instead; in a short test this worked quite well. The “issue” I have now is that Academy.Instance.TotalStepCount does not match the step count shown in Tensorboard. For example, Tensorboard is already at 1.66 million steps while Academy.Instance.TotalStepCount is still at 1.45 million. It would be nice if someone could explain the difference between them.

Hi, using “progress” as a measure uses the ratio of steps/max_steps. Would you mind providing your max_steps value from your config file?
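To make the ratio concrete, here is a minimal Python sketch of how a progress-based curriculum could pick a lesson. It is a simplified model and not the actual ML-Agents implementation: it ignores signal_smoothing and min_lesson_length, and the name `lesson_for_step` is hypothetical. The thresholds are the ones from the config posted above.

```python
def lesson_for_step(step, max_steps, thresholds):
    """Return the lesson index active at `step` in a simplified model.

    Assumption: a lesson is completed as soon as step / max_steps
    exceeds that lesson's threshold (no smoothing, no minimum length).
    """
    progress = step / max_steps
    lesson = 0
    for threshold in thresholds:
        if progress > threshold:
            lesson += 1
        else:
            break
    return lesson


# Thresholds of Lesson0, Lesson1 and Lesson2 from the posted config.
THRESHOLDS = [0.1, 0.3, 0.5]

if __name__ == "__main__":
    for step in (0, 100_000, 200_000, 400_000):
        print(step, lesson_for_step(step, 500_000, THRESHOLDS))
```

Under this model the same step count maps to very different lessons depending on max_steps, which is why the value in the config file matters for the diagnosis.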

For your question about steps, does the answer in the thread “What are steps exactly?” help?

Hi, for testing I set it to 500k. As you can see in the Tensorboard screenshot, the training stopped automatically at 500k as expected. Thanks for the link, that explains why there is a difference.

Thanks for the information. Would it be possible to include your whole config (.yaml) file here? I am trying to figure out whether this is a config issue or something on our end. Have you tried running the example environment that uses curriculum learning (WallJump), even if it is with fewer steps?

I attach my config file. No I haven’t tried running the example environment yet. But I could give it a try.
One thing that I have also noticed, is that I can only train in the UnityEditor. When I try to use mlagents-learn with the --env=<env_name> parameter it starts my game/build successfully (.app) but it does not start training. I only get this message after some time: The Unity environment took too long to respond.

I have the correct scene as the only scene in the build settings, and it doesn’t require any user interaction to start training. I think this must be a problem caused either by my project setup (I am using a networking solution for the multiplayer part of my game) or by my operating system, macOS Catalina. Maybe it’s the same cause as for the problem with curriculum training.

6391768–712630–Car-Curriculum.yaml.zip (1.21 KB)

Thanks so much. Can you try setting max_steps=50000 for testing, just to see if progress increases? I don’t expect anything useful to be trained, but it will be useful to check whether the curriculum is working. The file you sent had max_steps=5M, so IIUC it never would have hit Lesson1, since 0.1 progress = 500K steps.
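The arithmetic behind that estimate can be checked with a short Python sketch. It assumes only that progress is steps / max_steps, as described earlier in the thread; the helper name `steps_to_reach` is hypothetical.

```python
def steps_to_reach(threshold, max_steps):
    """Trainer steps at which steps / max_steps equals `threshold`."""
    return round(threshold * max_steps)


if __name__ == "__main__":
    # Lesson0 uses threshold 0.1 in the posted config.
    for max_steps in (50_000, 500_000, 5_000_000):
        needed = steps_to_reach(0.1, max_steps)
        print(f"max_steps={max_steps}: Lesson0 threshold reached at {needed} steps")
```

With max_steps=50000 the first threshold would be crossed after only 5,000 steps, so a short test run is enough to see whether the lesson changes at all.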

Hi, I already changed the max_steps parameter for my testing yesterday; I set max_steps=500000 in my config file. Today I did another run with max_steps=5M, and again the lesson did not increase. Should I do a test with the example environment with max_steps=50000?

Yes please, or some smaller number of steps where we can be certain that 0.1 (10%) progress is reached. If that doesn’t work, I can try to recreate this possible bug with an example environment. Did you have similar trouble with config/ppo/WallJump_curriculum.yaml and the WallJump environment, since it also uses curriculum?