Hi, I am trying to build a MotoGP racing game and use imitation learning to achieve my objective. I followed this video by CodeMonkey. I first created the .demo file by recording demonstrations myself. But the training fails when I try to run imitation learning with the following command:
mlagents-learn config/MotoGP.yaml --run-id=Imitation9
and then press the Play button. The game runs for a second or two, then stops, and I get the following error. Here it is in full detail:
Connected new brain: BikerBehavior?team=0
[INFO] Hyperparameters for behavior name BikerBehavior:
    trainer_type: ppo
    hyperparameters:
        batch_size: 10
        buffer_size: 100
        learning_rate: 0.0003
        beta: 0.0005
        epsilon: 0.2
        lambd: 0.99
        num_epoch: 3
        shared_critic: False
        learning_rate_schedule: linear
        beta_schedule: constant
        epsilon_schedule: linear
    network_settings:
        normalize: False
        hidden_units: 128
        num_layers: 2
        vis_encode_type: simple
        memory: None
        goal_conditioning_type: hyper
        deterministic: False
    reward_signals:
        extrinsic:
            gamma: 0.99
            strength: 1.0
            network_settings:
                normalize: False
                hidden_units: 128
                num_layers: 2
                vis_encode_type: simple
                memory: None
                goal_conditioning_type: hyper
                deterministic: False
        gail:
            gamma: 0.99
            strength: 0.5
            network_settings:
                normalize: False
                hidden_units: 128
                num_layers: 2
                vis_encode_type: simple
                memory: None
                goal_conditioning_type: hyper
                deterministic: False
            learning_rate: 0.0003
            encoding_size: None
            use_actions: False
            use_vail: False
            demo_path: Trainer/MotoGPTrainer.demo
    init_path: None
    keep_checkpoints: 5
    checkpoint_interval: 500000
    max_steps: 500000
    time_horizon: 64
    summary_freq: 10000
    threaded: False
    self_play: None
    behavioral_cloning:
        demo_path: Trainer/MotoGPTrainer.demo
        steps: 0
        strength: 0.5
        samples_per_update: 0
        num_epoch: None
        batch_size: None
D:\MotoGP Demo\venv\lib\site-packages\mlagents\trainers\torch_entities\utils.py:289: UserWarning: The use of `x.T` on tensors of dimension other than 2 to reverse their shape is deprecated and it will throw an error in a future release. Consider `x.mT` to transpose batches of matrices or `x.permute(*torch.arange(x.ndim - 1, -1, -1))` to reverse the dimensions of a tensor. (Triggered internally at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\TensorShape.cpp:3679.)
  torch.nn.functional.one_hot(_act.T, action_size[i]).float()
[INFO] Exported results\Imitation9\BikerBehavior\BikerBehavior-128.onnx
[INFO] Copied results\Imitation9\BikerBehavior\BikerBehavior-128.onnx to results\Imitation9\BikerBehavior.onnx.
Traceback (most recent call last):
  File "C:\Users\G524\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\G524\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "D:\MotoGP Demo\venv\Scripts\mlagents-learn.exe\__main__.py", line 7, in <module>
  File "D:\MotoGP Demo\venv\lib\site-packages\mlagents\trainers\learn.py", line 264, in main
    run_cli(parse_command_line())
  File "D:\MotoGP Demo\venv\lib\site-packages\mlagents\trainers\learn.py", line 260, in run_cli
    run_training(run_seed, options, num_areas)
  File "D:\MotoGP Demo\venv\lib\site-packages\mlagents\trainers\learn.py", line 136, in run_training
    tc.start_learning(env_manager)
  File "D:\MotoGP Demo\venv\lib\site-packages\mlagents_envs\timers.py", line 305, in wrapped
    return func(*args, **kwargs)
  File "D:\MotoGP Demo\venv\lib\site-packages\mlagents\trainers\trainer_controller.py", line 175, in start_learning
    n_steps = self.advance(env_manager)
  File "D:\MotoGP Demo\venv\lib\site-packages\mlagents_envs\timers.py", line 305, in wrapped
    return func(*args, **kwargs)
  File "D:\MotoGP Demo\venv\lib\site-packages\mlagents\trainers\trainer_controller.py", line 250, in advance
    trainer.advance()
  File "D:\MotoGP Demo\venv\lib\site-packages\mlagents\trainers\trainer\rl_trainer.py", line 302, in advance
    if self._update_policy():
  File "D:\MotoGP Demo\venv\lib\site-packages\mlagents\trainers\trainer\on_policy_trainer.py", line 111, in _update_policy
    update_stats = self.optimizer.bc_module.update()
  File "D:\MotoGP Demo\venv\lib\site-packages\mlagents\trainers\torch_entities\components\bc\module.py", line 95, in update
    run_out = self._update_batch(mini_batch_demo, self.n_sequences)
  File "D:\MotoGP Demo\venv\lib\site-packages\mlagents\trainers\torch_entities\components\bc\module.py", line 178, in _update_batch
    bc_loss = self._behavioral_cloning_loss(
  File "D:\MotoGP Demo\venv\lib\site-packages\mlagents\trainers\torch_entities\components\bc\module.py", line 118, in _behavioral_cloning_loss
    one_hot_expert_actions = ModelUtils.actions_to_onehot(
  File "D:\MotoGP Demo\venv\lib\site-packages\mlagents\trainers\torch_entities\utils.py", line 288, in actions_to_onehot
    onehot_branches = [
  File "D:\MotoGP Demo\venv\lib\site-packages\mlagents\trainers\torch_entities\utils.py", line 289, in <listcomp>
    torch.nn.functional.one_hot(_act.T, action_size[i]).float()
RuntimeError: Class values must be smaller than num_classes.
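For context on that final RuntimeError: torch.nn.functional.one_hot rejects any index greater than or equal to num_classes, so the crash means the .demo file contains a discrete action value at least as large as the corresponding action branch size of the current behavior (as I understand it, this can happen when the agent's action spec changes after the demo was recorded). A minimal pure-Python sketch of the invariant being violated (validate_demo_actions is a hypothetical helper for illustration, not part of the ML-Agents API):

```python
def validate_demo_actions(demo_actions, branch_sizes):
    """Check that every recorded discrete action index fits its branch.

    demo_actions: list of per-step action tuples, one index per discrete branch.
    branch_sizes: number of choices in each branch of the current behavior.
    Returns a list of (step, branch, value) entries that would make
    one_hot fail with "Class values must be smaller than num_classes".
    """
    offending = []
    for step, actions in enumerate(demo_actions):
        for branch, value in enumerate(actions):
            # one_hot requires 0 <= value < num_classes for that branch
            if not 0 <= value < branch_sizes[branch]:
                offending.append((step, branch, value))
    return offending


# Example: demo recorded when branch 0 had 5 choices, behavior now has 3
demo = [(0, 1), (4, 0), (2, 1)]
print(validate_demo_actions(demo, branch_sizes=[3, 2]))  # → [(1, 0, 4)]
```

If a check like this flags entries, re-recording the .demo file against the current behavior's action settings would be the natural fix.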
You can infer the contents of my .yaml file from the hyperparameter dump above. I read similar problems posted by others, and based on those I experimented with removing the behavioral_cloning: section from the .yaml file. With that removed there is no error in the command prompt console and the game stays in Play mode, but the bike doesn't move an inch. Can anyone help me solve this?
Also, the ML-Agents package has not seen an update since 2021. Why is that?