Neural network architecture for PPO

Hi

I was wondering if someone could describe the exact network architecture used for the PPO algorithm. Does it use two separate networks for the actor and the critic, or one network with two different heads?

Also, for the “simple” visual encoder with “num_layers” set to 2 and “hidden_units” set to 128, does that mean it has 2 convolutional layers followed by 2 fully connected layers, and then a final layer whose size is the number of actions?

Thanks

You can load ONNX models into https://netron.app/ to visualize their networks.
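
To make the two layouts from the question concrete (one shared network with two heads versus separate actor and critic networks), here is a rough sketch that counts the parameters of the fully connected part in each case. It uses 2 layers of 128 hidden units as in the question; the observation size of 8 and the action count of 4 are made-up values, and the function names are purely illustrative. It does not claim to show which layout the exported model really uses, so checking the ONNX file in Netron is still the way to confirm that.

```python
def dense_params(sizes):
    """Weights plus biases for a stack of fully connected layers.

    sizes lists the layer widths in order, e.g. [8, 128, 128].
    """
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


def shared_trunk_params(obs_size, hidden_units, num_layers, num_actions):
    """One shared body feeding a policy head and a value head."""
    trunk = dense_params([obs_size] + [hidden_units] * num_layers)
    policy_head = dense_params([hidden_units, num_actions])
    value_head = dense_params([hidden_units, 1])
    return trunk + policy_head + value_head


def separate_networks_params(obs_size, hidden_units, num_layers, num_actions):
    """Independent actor and critic networks, each with its own body."""
    hidden = [hidden_units] * num_layers
    actor = dense_params([obs_size] + hidden + [num_actions])
    critic = dense_params([obs_size] + hidden + [1])
    return actor + critic


# Hypothetical sizes, chosen only for illustration.
OBS_SIZE = 8
NUM_ACTIONS = 4

shared = shared_trunk_params(OBS_SIZE, 128, 2, NUM_ACTIONS)
separate = separate_networks_params(OBS_SIZE, 128, 2, NUM_ACTIONS)
print(shared, separate)  # 18309 35973
```

With separate networks the hidden layers are duplicated, so the parameter count is roughly double. In Netron the same difference would show up as either one chain of layers that branches near the output or two parallel chains starting from the input.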
