RuntimeError when using imitation learning

Hello. I am trying to use imitation learning. However, when I pass the recorded .demo file to the trainer, the following Torch-related error appears. Thanks.

RuntimeError: cannot sample n_sample > prob_dist.size(-1) samples without replacement

 Version information:
  ml-agents: 0.27.0,
  ml-agents-envs: 0.27.0,
  Communicator API: 1.5.0,
  PyTorch: 1.9.0+cu111
[INFO] Listening on port 5004. Start training by pressing the Play button in the Unity Editor.
[INFO] Connected to Unity environment with package version 2.0.0-pre.3 and communication version 1.5.0
[INFO] Connected new brain: ZARAgoal?team=1
[WARNING] Deleting TensorBoard data events.out.tfevents.1624485266.AndreasPC.18212.0 that was left over from a previous run.
[INFO] Hyperparameters for behavior name ZARAgoal:
        trainer_type:   ppo
        hyperparameters:
          batch_size:   128
          buffer_size:  2048
          learning_rate:        0.0003
          beta: 0.01
          epsilon:      0.2
          lambd:        0.95
          num_epoch:    3
          learning_rate_schedule:       linear
        network_settings:
          normalize:    False
          hidden_units: 256
          num_layers:   2
          vis_encode_type:      simple
          memory:       None
          goal_conditioning_type:       hyper
        reward_signals:
          extrinsic:
            gamma:      0.99
            strength:   1.0
            network_settings:
              normalize:        False
              hidden_units:     128
              num_layers:       2
              vis_encode_type:  simple
              memory:   None
              goal_conditioning_type:   hyper
          gail:
            gamma:      0.99
            strength:   0.01
            network_settings:
              normalize:        False
              hidden_units:     128
              num_layers:       2
              vis_encode_type:  simple
              memory:   None
              goal_conditioning_type:   hyper
            learning_rate:      0.0003
            encoding_size:      None
            use_actions:        False
            use_vail:   False
            demo_path:  Demos/ZARAdemos/
        init_path:      None
        keep_checkpoints:       5
        checkpoint_interval:    500000
        max_steps:      100000
        time_horizon:   64
        summary_freq:   60000
        threaded:       False
        self_play:      None
        behavioral_cloning:
          demo_path:    Demos/ZARAdemos/
          steps:        50000
          strength:     1.0
          samples_per_update:   0
          num_epoch:    None
          batch_size:   None
d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\torch\nn\init.py:388: UserWarning: Initializing zero-element tensors is a no-op
  warnings.warn("Initializing zero-element tensors is a no-op")
d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\torch\nn\init.py:426: UserWarning: Initializing zero-element tensors is a no-op
  warnings.warn("Initializing zero-element tensors is a no-op")
Traceback (most recent call last):
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\trainer_controller.py", line 176, in start_learning
    n_steps = self.advance(env_manager)
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents_envs\timers.py", line 305, in wrapped
    return func(*args, **kwargs)
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\trainer_controller.py", line 234, in advance
    new_step_infos = env_manager.get_steps()
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\env_manager.py", line 124, in get_steps
    new_step_infos = self._step()
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\subprocess_env_manager.py", line 298, in _step
    self._queue_steps()
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\subprocess_env_manager.py", line 291, in _queue_steps
    env_action_info = self._take_step(env_worker.previous_step)
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents_envs\timers.py", line 305, in wrapped
    return func(*args, **kwargs)
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\subprocess_env_manager.py", line 429, in _take_step
    all_action_info[brain_name] = self.policies[brain_name].get_action(
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\policy\torch_policy.py", line 212, in get_action
    run_out = self.evaluate(decision_requests, global_agent_ids)
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents_envs\timers.py", line 305, in wrapped
    return func(*args, **kwargs)
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\policy\torch_policy.py", line 178, in evaluate
    action, log_probs, entropy, memories = self.sample_actions(
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents_envs\timers.py", line 305, in wrapped
    return func(*args, **kwargs)
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\policy\torch_policy.py", line 140, in sample_actions
    actions, log_probs, entropies, memories = self.actor.get_action_and_stats(
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\torch\networks.py", line 626, in get_action_and_stats
    action, log_probs, entropies = self.action_model(encoding, masks)
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\torch\action_model.py", line 194, in forward
    actions = self._sample_action(dists)
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\torch\action_model.py", line 84, in _sample_action
    discrete_action.append(discrete_dist.sample())
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\torch\distributions.py", line 114, in sample
    return torch.multinomial(self.probs, 1)
RuntimeError: cannot sample n_sample > prob_dist.size(-1) samples without replacement

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\antre\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\antre\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "D:\Desktop\Crowds-and-ML-Agents\venv\Scripts\mlagents-learn.exe\__main__.py", line 7, in <module>
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\learn.py", line 250, in main
    run_cli(parse_command_line())
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\learn.py", line 246, in run_cli
    run_training(run_seed, options)
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\learn.py", line 125, in run_training
    tc.start_learning(env_manager)
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents_envs\timers.py", line 305, in wrapped
    return func(*args, **kwargs)
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\trainer_controller.py", line 201, in start_learning
    self._save_models()
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents_envs\timers.py", line 305, in wrapped
    return func(*args, **kwargs)
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\trainer_controller.py", line 80, in _save_models
    self.trainers[brain_name].save_model()
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\trainer\rl_trainer.py", line 185, in save_model
    model_checkpoint = self._checkpoint()
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents_envs\timers.py", line 305, in wrapped
    return func(*args, **kwargs)
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\trainer\rl_trainer.py", line 157, in _checkpoint
    export_path, auxillary_paths = self.model_saver.save_checkpoint(
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\model_saver\torch_model_saver.py", line 59, in save_checkpoint
    self.export(checkpoint_path, behavior_name)
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\model_saver\torch_model_saver.py", line 64, in export
    self.exporter.export_policy_model(output_filepath)
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\torch\model_serialization.py", line 159, in export_policy_model
    torch.onnx.export(
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\torch\onnx\__init__.py", line 275, in export
    return utils.export(model, args, f, export_params, verbose, training,
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\torch\onnx\utils.py", line 88, in export
    _export(model, args, f, export_params, verbose, training, input_names, output_names,
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\torch\onnx\utils.py", line 689, in _export
    _model_to_graph(model, args, verbose, input_names,
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\torch\onnx\utils.py", line 458, in _model_to_graph
    graph, params, torch_out, module = _create_jit_graph(model, args,
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\torch\onnx\utils.py", line 422, in _create_jit_graph
    graph, torch_out = _trace_and_get_graph_from_model(model, args)
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\torch\onnx\utils.py", line 373, in _trace_and_get_graph_from_model
    torch.jit._get_trace_graph(model, args, strict=False, _force_outplace=False, _return_inputs_states=True)
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\torch\jit\_trace.py", line 1160, in _get_trace_graph
    outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\torch\jit\_trace.py", line 127, in forward
    graph, out = torch._C._create_graph_by_tracing(
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\torch\jit\_trace.py", line 118, in wrapper
    outs.append(self.inner(*trace_inputs))
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\torch\nn\modules\module.py", line 1039, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\torch\networks.py", line 664, in forward
    ) = self.action_model.get_action_out(encoding, masks)
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\torch\action_model.py", line 171, in get_action_out
    discrete_out_list = [
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\torch\action_model.py", line 172, in <listcomp>
    discrete_dist.exported_model_output()
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\torch\distributions.py", line 136, in exported_model_output
    return self.sample()
  File "d:\desktop\crowds-and-ml-agents\venv\lib\site-packages\mlagents\trainers\torch\distributions.py", line 114, in sample
    return torch.multinomial(self.probs, 1)
RuntimeError: cannot sample n_sample > prob_dist.size(-1) samples without replacement

I'll flag this with the team for some guidance. Which version of ML-Agents are you using?

Hello Trey. I finally found the problem.
Although I am not using discrete actions, I had to set Branch 0 Size to 1 instead of 0 in the Behavior Parameters. With a size of 0 the policy apparently ends up sampling from an empty distribution, as the sketch below shows.

Thanks.
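For anyone hitting the same error: here is a minimal sketch in plain PyTorch (not the actual ML-Agents code path) of what seems to go wrong when a discrete branch has size 0. The zero-element tensor warnings in the log above and the torch.multinomial call in the traceback suggest the probability tensor for that branch is empty, which reproduces the same RuntimeError.

import torch

# With Branch 0 Size = 0, the categorical distribution for that branch
# appears to end up with zero categories, i.e. a probability tensor
# whose last dimension has size 0.
probs = torch.ones((1, 0))

# ML-Agents draws one action per branch via torch.multinomial(probs, 1);
# asking for 1 sample from 0 categories (without replacement) raises:
#   RuntimeError: cannot sample n_sample > prob_dist.size(-1) samples without replacement
action = torch.multinomial(probs, 1)

Setting the branch size to 1 gives the sampler at least one category to draw from, which is presumably why the fix above works even though the discrete action itself is unused.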
