What is the meaning of "network_settings -> normalize" in Configurations

For example, in the Walker environment, if normalization is set to false, the agent learns nothing. In the C# code, the input is not scaled to between 0 and 1, so the normalization must be done by network_settings -> normalize, right? That’s where I’m confused.
If normalize is true, does that mean the network starts with

torch.nn.BatchNorm1d(obs_features)
torch.nn.ReLU() ?

I’m using my own RL code and am stuck here. If it means BatchNorm1d(), then because the agent has to interact with the environment, the network only gets one observation from the environment at a time, so the running average and variance calculated this way will have a very large error, especially at the beginning. I tried it without success.

Did I misunderstand, or does the official code first take random actions to get a more appropriate running average and variance? (I didn’t fully understand the official code.)

Thanks for the question. The normalization we use is based on a running average across all previous data in a session (not just one batch). I have included some links below to where and how the normalization is done but please let me know if you have other questions.
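To make the idea of a session-wide running average concrete, here is a minimal sketch in NumPy. It is not the ML-Agents implementation: the class name RunningNormalizer, the clip and eps parameters, and the use of Welford's online update are all assumptions made for illustration. The point it shows is that the statistics accumulate over every observation seen so far, so feeding one observation at a time still works.

```python
import numpy as np


class RunningNormalizer:
    """Hypothetical helper: normalizes with statistics over all data seen so far."""

    def __init__(self, obs_size, clip=5.0, eps=1e-8):
        self.count = 0                   # number of observations seen
        self.mean = np.zeros(obs_size)   # running mean per feature
        self.m2 = np.zeros(obs_size)     # running sum of squared deviations
        self.clip = clip                 # assumed clipping range for the output
        self.eps = eps                   # avoids division by zero

    def update(self, obs):
        """Fold a single observation or a batch into the running statistics."""
        batch = np.atleast_2d(np.asarray(obs, dtype=np.float64))
        for x in batch:  # Welford's online update, one observation at a time
            self.count += 1
            delta = x - self.mean
            self.mean = self.mean + delta / self.count
            self.m2 = self.m2 + delta * (x - self.mean)

    def normalize(self, obs):
        """Scale an observation with the current running mean and variance."""
        var = self.m2 / max(self.count, 1)
        std = np.sqrt(var) + self.eps
        z = (np.asarray(obs, dtype=np.float64) - self.mean) / std
        return np.clip(z, -self.clip, self.clip)
```

In a custom training loop, one possible usage is to call update() on each observation as it arrives and pass normalize(obs) to the network. Early estimates are still noisy, but they improve as more observations are collected, and the clipping keeps extreme values bounded in the meantime.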

Config docs
From: ml-agents/docs/Training-Configuration-File.md at main · Unity-Technologies/ml-agents · GitHub
“normalization is based on the running average and variance of the vector observation”.

Specific implementations
Tensorflow: ml-agents/ml-agents/mlagents/trainers/tf/models.py at a14730fb4aa16820fe4b4a295a49e9c2c56d5b03 · Unity-Technologies/ml-agents · GitHub

Torch: ml-agents/ml-agents/mlagents/trainers/torch/encoders.py at a14730fb4aa16820fe4b4a295a49e9c2c56d5b03 · Unity-Technologies/ml-agents · GitHub

When we process the data
[PPO] ml-agents/ml-agents/mlagents/trainers/ppo/trainer.py at 084d1c8b1f80715fb5590905c399c093ab22f937 · Unity-Technologies/ml-agents · GitHub
[SAC] ml-agents/ml-agents/mlagents/trainers/sac/trainer.py at 084d1c8b1f80715fb5590905c399c093ab22f937 · Unity-Technologies/ml-agents · GitHub

Thank you for your help. I see.

During inference, is the “test data” normalised as well? Does the ONNX file keep a record of the average and variance from the training data and use them? Thanks!