SAC or PPO

Can I just train my agent by changing the trainer param to sac and adding these lines:
buffer_init_steps: 0
tau: 0.005
steps_per_update: 10.0
save_replay_buffer: false
init_entcoef: 0.5
reward_signal_steps_per_update: 10.0
?
Or do I need to change something in Unity or in the terminal?

You don’t need to change anything in Unity, just the trainer config. You might also have to remove some of the PPO-specific hyperparameters, though (the script will error out and tell you which ones).

I did not have to. I have both the PPO and SAC parameters in the file and it looks like it's training. However, it did take almost an hour to run 300000 steps, which is quite a lot. What should I change?

How long does PPO take for the same environment (300k steps)? SAC is generally slower than PPO in wall-clock time per step.

10 mins

Generally SAC speed is controlled by the network size (num_layers and hidden_units), and steps_per_update. Decreasing network size and increasing steps_per_update will speed up training. But SAC typically also takes fewer steps to achieve the same reward as PPO, so you might not need to run it as long in the first place.
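
To put rough numbers on this, here is a back-of-the-envelope sketch in Python. It assumes that steps_per_update is the average number of agent steps taken per policy update, and it rounds "almost an hour" to 60 minutes. The helper names are made up for illustration and are not part of ML-Agents.

```python
def approx_updates(total_steps: int, steps_per_update: float) -> float:
    """Approximate number of policy updates over a run (assumes
    steps_per_update = average agent steps per policy update)."""
    return total_steps / steps_per_update


def steps_per_second(total_steps: int, minutes: float) -> float:
    """Average throughput in environment steps per second."""
    return total_steps / (minutes * 60)


total_steps = 300_000

# Doubling steps_per_update halves the number of updates for the same steps.
print(approx_updates(total_steps, 10.0))  # 30000.0
print(approx_updates(total_steps, 20.0))  # 15000.0

# Throughput from the timings reported in this thread (approximate).
print(round(steps_per_second(total_steps, 60), 1))  # SAC: 83.3
print(round(steps_per_second(total_steps, 10), 1))  # PPO: 500.0
```

Fewer updates per step is where the speed-up comes from, and it is also why each step teaches the agent less.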

Thanks a lot. How much can I decrease the network size (I have 512 hidden units and 2 layers), and how much can I increase steps_per_update (I have it at 10)?

Hi @mateolopezareal , I’d suggest letting it run until the reward reaches your target before making changes. Increasing steps_per_update will make the steps go by faster, but it decreases sample efficiency (i.e. it takes more steps to reach the same reward).

Network size is trial and error - if it works with 512 and 2 layers, try 256 and 2 layers. If the Q loss and policy loss in the plot keep rising, you’ll have to increase the network size.
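
If you'd rather not eyeball the plot, the "keeps rising" check can be roughly approximated in a few lines of Python. This is only a sketch: keeps_rising is a hypothetical helper, not an ML-Agents function, the window size is an arbitrary assumption, and the loss values below are made up for illustration.

```python
from statistics import mean
from typing import Sequence


def keeps_rising(losses: Sequence[float], window: int = 5) -> bool:
    """Return True if the average of the last `window` loss values is
    higher than the average of the `window` values before them."""
    if len(losses) < 2 * window:
        return False  # not enough data points to judge a trend
    recent = mean(losses[-window:])
    earlier = mean(losses[-2 * window:-window])
    return recent > earlier


# Made-up loss curves, e.g. read off the training plot at regular intervals.
falling = [0.9, 0.8, 0.7, 0.6, 0.5, 0.45, 0.4, 0.38, 0.36, 0.35]
rising = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]

print(keeps_rising(falling))  # False: the smaller network seems to cope
print(keeps_rising(rising))   # True: consider a larger network again
```

Loss curves are noisy, so treat this as a rough signal and still look at the plot before changing hidden_units or num_layers.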

Tips like these are worth gold for anyone just starting with machine learning.

Could you add this “rule of thumb” to the documentation? Maybe https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Training-Configuration-File.md ? It already has a recommendation for tuning the learning rate, I think it’ll fit well.
