Agent training freezes at buffer_size steps

I'm training a self-driving AI car, but training stops every time a multiple of buffer_size steps is reached (i.e. if buffer_size is 1000, training freezes at 1000 steps, 2000 steps, 3000 steps, and so on). After 5 to 8 minutes, training resumes normally until the next multiple of buffer_size is reached, and the process repeats.

To my understanding, reaching buffer_size is when the model gets updated. Does this mean that my model is simply too large (a 20x20 grid sensor with 3 tags and 2 stacks, plus some more raycasts), or can I improve it through my hyperparameters? (My rough math on what one update costs with these values is below the config.)

Here are my hyperparameters:

behaviors:
  CarAgentFollow:
    trainer_type: ppo
    hyperparameters:
      # Hyperparameters common to PPO and SAC
      batch_size: 4096
      buffer_size: 65536
      learning_rate: 3.0e-4
      learning_rate_schedule: linear
      # PPO-specific hyperparameters
      beta: 5.0e-3
      beta_schedule: linear
      epsilon: 0.2
      epsilon_schedule: linear
      lambd: 0.9
      num_epoch: 13
    # Configuration of the neural network (common to PPO/SAC)
    network_settings:
      vis_encode_type: simple
      normalize: true
      hidden_units: 128
      num_layers: 2

    # Trainer configurations common to all trainers
    max_steps: 3.5e6
    time_horizon: 512
    summary_freq: 10000
    keep_checkpoints: 5
    checkpoint_interval: 40000
    threaded: true
    init_path: null
   
    reward_signals:
      # environment reward (default)
      extrinsic:
        strength: 1.0
        gamma: 0.99

      # curiosity module
      curiosity:
        strength: 0.01
        gamma: 0.99
        learning_rate: 3.0e-4

environment_parameters:
  levels:
    curriculum:
      - name: ObstaclesDodge_Easy
        completion_criteria:
          measure: reward
          behavior: CarAgentFollow
          signal_smoothing: true
          threshold: 4.75
          min_lesson_length: 100
        value:
          sampler_type: uniform
          sampler_parameters:
            min_value: 1
            max_value: 2
      - name: ObstaclesDodge_Medium
        completion_criteria:
          measure: reward
          behavior: CarAgentFollow
          signal_smoothing: true
          threshold: 4.65
          min_lesson_length: 100
        value:
          sampler_type: uniform
          sampler_parameters:
            min_value: 2
            max_value: 3
      - name: ObstaclesDodge_Hard
        completion_criteria:
          measure: reward
          behavior: CarAgentFollow
          signal_smoothing: true
          threshold: 4.55
          min_lesson_length: 100
        value:
          sampler_type: uniform
          sampler_parameters:
            min_value: 3
            max_value: 5
      - name: ObstaclesDodge_Expert
        value:
          sampler_type: uniform
          sampler_parameters:
            min_value: 5
            max_value: 8
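
If I understand the docs correctly, the pause at each buffer_size boundary is the PPO update itself: the trainer makes num_epoch passes over the full buffer in minibatches of batch_size. My rough math for the values above (please correct me if I'm wrong):

  minibatches per epoch     = buffer_size / batch_size = 65536 / 4096 = 16
  gradient steps per update = num_epoch * 16           = 13 * 16      = 208

So every 65,536 steps the trainer runs 208 back-to-back gradient steps over the stacked grid-sensor observations, which would explain a stall of several minutes.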

Any help will be greatly appreciated 🙂

Also note that GAIL and BC make a big difference in the freeze time:

without: ~5 mins
with: ~8 mins
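
For reference, the "with" runs add GAIL and behavioral cloning to the same behavior, roughly like this (the demo path and strength values here are placeholders, not my exact settings):

    # added under CarAgentFollow for the "with" runs; the .demo path is a placeholder
    reward_signals:
      gail:
        strength: 0.5
        gamma: 0.99
        demo_path: Demos/CarFollow.demo
    behavioral_cloning:
      demo_path: Demos/CarFollow.demo
      strength: 0.5
      steps: 150000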

I am seeing similar behaviour when training the example environments: the editor/environment freezes every few seconds. Is this normal?
I can't remember this happening when I used ML-Agents two years ago. Back then it ran smoothly through training on the same machine.

Have you found a solution to your problem yet? I have the same issue: training freezes when buffer_size is reached.

Sorry, but I haven't… I might switch to something computationally simpler (lots of raycasts instead of grid sensors) and see how it goes.

Hey, just wondering: do you have a large buffer_size to batch_size ratio? It seemed to help when I lowered my buffer_size and increased batch_size within a reasonable range. (Note: I also removed the grid sensors and used raycasts instead, so the less complex observations may also have cut down the freeze time.)
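
With your batch_size of 4096, that would mean something like the following (the exact number is just an illustration of the ratio, not a tuned value):

      batch_size: 4096       # unchanged
      buffer_size: 16384     # ~4x batch_size instead of 16x, so each update processes a quarter of the data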