Hey,
just trained the PushBlock example environment with ‘save_replay_buffer’ set to ‘true’. During training, the ‘last_replay_buffer.hdf5’ file size was around 13 MB. After training it decreased to 4 KB.
I wonder what causes this huge compression after training? I mean how can you store an experience replay buffer with 50k observations in 4KB of storage?
Hi @BotAcademy ,
Can you be more specific about the steps to reproduce this? Are you interrupting training and using --resume?
One other thing to note is that if you hit Ctrl-C to interrupt training, hitting it again could interrupt the saving, which would lead to a truncated file.
When interrupting training, it works fine. I just experienced it when I finished a training run, without interruption. I’d still expect the buffer to be saved correctly for the case where I want to improve an already trained agent at some point.
Can be easily reproduced by training the PushBlock environment with SAC and save_replay_buffer:true
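In the trainer config, that means something along these lines (trimmed to the relevant keys; the exact layout may differ between ml-agents versions, so treat this as a sketch rather than a copy-paste config):

```yaml
behaviors:
  PushBlock:
    trainer_type: sac
    hyperparameters:
      save_replay_buffer: true
      # other SAC hyperparameters left at their defaults
```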
OK, I can reproduce this quickly now (I added a log line after writing the buffer to output the file size). My guess is that we clear the buffer at the end of training before saving it. I’ll look into it more tomorrow.
We do indeed clear the buffer at the end of training: ml-agents/ml-agents/mlagents/trainers/trainer/rl_trainer.py at release_5 · Unity-Technologies/ml-agents · GitHub
And this happens before the buffer gets saved, which explains the small file size. I’m not exactly sure what’s in the 4KB file, presumably metadata and padding from HDF5.
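If you want to convince yourself that an essentially empty HDF5 file still takes a few KB, a tiny h5py snippet shows it (this is a standalone illustration, not ml-agents code, and the dataset name and shape are made up):

```python
import os

import h5py
import numpy as np

# Mimic a replay buffer whose contents were cleared before saving:
# the dataset exists, but it holds zero entries.
with h5py.File("empty_buffer.hdf5", "w") as f:
    f.create_dataset("observations", data=np.zeros((0, 210), dtype=np.float32))

# Even with no samples, the file is a few KB of HDF5 metadata.
print(os.path.getsize("empty_buffer.hdf5"), "bytes")
```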
Is this causing any problems, or did it just look weird?
No problems because I’m not using the saved buffer (just tested it for understanding).
But if I did use it, I would expect the buffer to be saved at the end of training in case I want to resume training with a higher step limit at some point. If there are reasons to clear it at the end even when ‘save_replay_buffer’ is set to true, that’s totally fine, but I don’t see them, so it’s not the expected behaviour for me (but I am by far no expert on this).
Good point. I’ve got this logged as MLA-1251 in our internal tracker.
The clearing goes back to this change, which prevents a memory leak when some agents are learning and others are doing (Python-driven) inference. So it’s still necessary, but maybe we can do it at a different time (or just not append to the buffer during inference).
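Roughly the kind of ordering change I mean, as a simplified sketch (the class and method names here are made up for illustration and don’t match the actual trainer internals):

```python
class ReplayBufferTrainerSketch:
    """Illustration of 'save before clear' at the end of training."""

    def __init__(self, save_replay_buffer: bool):
        self.save_replay_buffer = save_replay_buffer
        self.update_buffer = []  # stand-in for the experience buffer

    def _write_buffer_to_disk(self):
        # Placeholder for the HDF5 write of the buffer contents.
        print(f"saving {len(self.update_buffer)} experiences")

    def finish_training(self):
        # Save first, so a resumed run still has the collected experiences...
        if self.save_replay_buffer:
            self._write_buffer_to_disk()
        # ...then clear, which is still needed so that sessions where some
        # agents keep doing inference don't grow the buffer without bound.
        self.update_buffer.clear()
```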