I tried adding memory to my agent while training with PPO, and when the first ONNX model checkpoint is saved, this warning is written to the console:
UserWarning: Exporting a model to ONNX with a batch_size other than 1, with a variable length with LSTM can cause an error when running the ONNX model with a different batch size. Make sure to save the model with a batch size of 1, or define the initial states (h0/c0) as inputs of the model.
I’m not quite sure how to interpret this, so any insight would be greatly appreciated.
Here is my YAML config file:
behaviors:
  SkripsieFighter:
    trainer_type: ppo
    hyperparameters:
      batch_size: 512
      buffer_size: 10240
      learning_rate: 1.0e-4
      beta: 5.0e-3
      epsilon: 0.2
      lambd: 0.9
      num_epoch: 3
      learning_rate_schedule: linear
    network_settings:
      normalize: false
      hidden_units: 512
      num_layers: 2
      memory:
        memory_size: 128
        sequence_length: 64
    reward_signals:
      extrinsic:
        gamma: 1.0
        strength: 1.0
    max_steps: 50000000
    time_horizon: 1024
    summary_freq: 50000
    self_play:
      save_steps: 100000
      team_change: 500000
      swap_steps: 50000
      window: 30
      play_against_latest_model_ratio: 0.5
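
In case it helps anyone reading the config, here is a rough Python sketch of how the memory-related numbers relate to each other. The values are copied from the YAML above. The `summarize` helper and the particular checks are my own assumptions about what is worth verifying (for example, that `batch_size` divides evenly into sequences of `sequence_length`); they are not rules taken from the trainer, and none of this explains the ONNX warning itself.

```python
# Values copied from the YAML config above.
config = {
    "batch_size": 512,
    "buffer_size": 10240,
    "memory_size": 128,
    "sequence_length": 64,
    "time_horizon": 1024,
}


def summarize(cfg):
    """Derive a few quantities from a recurrent PPO config.

    Hypothetical helper for illustration only; the checks are assumptions,
    not validation performed by the training library.
    """
    return {
        # Number of full mini-batches that fit in one experience buffer
        "batches_per_buffer": cfg["buffer_size"] // cfg["batch_size"],
        "buffer_is_multiple_of_batch": cfg["buffer_size"] % cfg["batch_size"] == 0,
        # Number of full sequences of length sequence_length per mini-batch
        "sequences_per_batch": cfg["batch_size"] // cfg["sequence_length"],
        "batch_is_multiple_of_sequence": cfg["batch_size"] % cfg["sequence_length"] == 0,
        # Assumption: an even memory_size can be split into two equal halves
        "memory_size_is_even": cfg["memory_size"] % 2 == 0,
        "sequence_fits_in_time_horizon": cfg["sequence_length"] <= cfg["time_horizon"],
    }


summary = summarize(config)
for name, value in summary.items():
    print(f"{name}: {value}")
```

With these values, each buffer holds 20 mini-batches and each mini-batch holds 8 sequences of 64 steps, so the config at least looks internally consistent on those points.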