UnityGymWrapper crash after ~2M timesteps

Hello, has anyone worked with the UnityGymWrapper and StableBaselines3 and gotten this error? Or does anyone have a clue what to check to debug it? I am a bit lost here. It happens only after roughly 2M timesteps, so I don't think it is a version compatibility issue.

-----------------------------------------
| rollout/                |             |
|    ep_len_mean          | 1e+03       |
|    ep_rew_mean          | -754        |
| time/                   |             |
|    fps                  | 51          |
|    iterations           | 799         |
|    time_elapsed         | 31563       |
|    total_timesteps      | 1636352     |
| train/                  |             |
|    approx_kl            | 0.045302175 |
|    clip_fraction        | 0.396       |
|    clip_range           | 0.2         |
|    entropy_loss         | -2.23       |
|    explained_variance   | 0.356       |
|    learning_rate        | 0.0003      |
|    loss                 | 7.15        |
|    n_updates            | 7980        |
|    policy_gradient_loss | 0.00526     |
|    std                  | 0.429       |
|    value_loss           | 8.91        |
-----------------------------------------
Process ForkServerProcess-2:
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/opt/conda/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/home/ros/.local/lib/python3.7/site-packages/stable_baselines3/common/vec_env/subproc_vec_env.py", line 29, in _worker
    observation, reward, done, info = env.step(data)
  File "/home/ros/.local/lib/python3.7/site-packages/stable_baselines3/common/monitor.py", line 90, in step
    observation, reward, done, info = self.env.step(action)
  File "/home/ros/.local/lib/python3.7/site-packages/gym/wrappers/order_enforcing.py", line 11, in step
    observation, reward, done, info = self.env.step(action)
  File "/app/environment_controller.py", line 200, in step
    s, r, d, info = self.env.step(action)
  File "/home/ros/.local/lib/python3.7/site-packages/gym_unity/envs/__init__.py", line 201, in step
    self._env.step()
  File "/home/ros/.local/lib/python3.7/site-packages/mlagents_envs/timers.py", line 305, in wrapped
    return func(*args, **kwargs)
  File "/home/ros/.local/lib/python3.7/site-packages/mlagents_envs/environment.py", line 333, in step
    outputs = self._communicator.exchange(step_input, self._poll_process)
  File "/home/ros/.local/lib/python3.7/site-packages/mlagents_envs/rpc_communicator.py", line 137, in exchange
    self.poll_for_timeout(poll_callback)
  File "/home/ros/.local/lib/python3.7/site-packages/mlagents_envs/rpc_communicator.py", line 112, in poll_for_timeout
    "The Unity environment took too long to respond. Make sure that :\n"
mlagents_envs.exception.UnityTimeOutException: The Unity environment took too long to respond. Make sure that :
         The environment does not need user interaction to launch
         The Agents' Behavior Parameters > Behavior Type is set to "Default"
         The environment and the Python interface have compatible versions.
------------------------------------------------------------------------------
Total execution time: 31661.34 seconds
------------------------------------------------------------------------------
Total execution time: 31669.33 seconds
Traceback (most recent call last):
  File "/app/main.py", line 15, in <module>
    main()
  File "/app/main.py", line 9, in main
    m_Trainer.model_pipeline()
  File "/app/trainer.py", line 44, in _time_it
    return func(*args, **kwargs)
  File "/app/trainer.py", line 172, in model_pipeline
    _ = self._train_pipeline(model)
  File "/app/trainer.py", line 44, in _time_it
    return func(*args, **kwargs)
  File "/app/trainer.py", line 286, in _train_pipeline
    verbose=2,
  File "/home/ros/.local/lib/python3.7/site-packages/stable_baselines3/ppo/ppo.py", line 310, in learn
    reset_num_timesteps=reset_num_timesteps,
  File "/home/ros/.local/lib/python3.7/site-packages/stable_baselines3/common/on_policy_algorithm.py", line 237, in learn
    continue_training = self.collect_rollouts(self.env, callback, self.rollout_buffer, n_rollout_steps=self.n_steps)
  File "/home/ros/.local/lib/python3.7/site-packages/stable_baselines3/common/on_policy_algorithm.py", line 178, in collect_rollouts
    new_obs, rewards, dones, infos = env.step(clipped_actions)
  File "/home/ros/.local/lib/python3.7/site-packages/stable_baselines3/common/vec_env/base_vec_env.py", line 162, in step
    return self.step_wait()
  File "/home/ros/.local/lib/python3.7/site-packages/stable_baselines3/common/vec_env/subproc_vec_env.py", line 120, in step_wait
    results = [remote.recv() for remote in self.remotes]
  File "/home/ros/.local/lib/python3.7/site-packages/stable_baselines3/common/vec_env/subproc_vec_env.py", line 120, in <listcomp>
    results = [remote.recv() for remote in self.remotes]
  File "/opt/conda/lib/python3.7/multiprocessing/connection.py", line 250, in recv
    buf = self._recv_bytes()
  File "/opt/conda/lib/python3.7/multiprocessing/connection.py", line 407, in _recv_bytes
    buf = self._recv(4)
  File "/opt/conda/lib/python3.7/multiprocessing/connection.py", line 383, in _recv
    raise EOFError
EOFError

I verified my versions: I am running Release 17 on my Windows machine and mlagents 0.26 on the server where I am trying to train.

Unfortunately it crashes about two out of three times. As the image below shows, the first run crashed after about 1.5M timesteps, the second after roughly 2.5M, and the last one (run65) did finish.

I guess that until I figure out what the problem is, I will just recreate the environment object after every third of the total timesteps and see if that works.
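
For reference, here is a rough sketch of what I mean by that workaround. The names `split_timesteps`, `train_in_chunks` and `make_env` are made up for this post (they are not part of SB3 or ML-Agents); `make_env` stands for whatever factory builds my SubprocVecEnv around the Unity wrapper. The only SB3 calls I rely on are `model.set_env(...)`, `model.learn(..., reset_num_timesteps=False)` and `env.close()`. I have not confirmed that this avoids the timeout, it only limits how long each Unity process stays alive.

```python
from typing import Callable, List


def split_timesteps(total_timesteps: int, n_chunks: int) -> List[int]:
    """Split total_timesteps into n_chunks near-equal parts that sum to the total."""
    base, extra = divmod(total_timesteps, n_chunks)
    return [base + (1 if i < extra else 0) for i in range(n_chunks)]


def train_in_chunks(model, make_env: Callable[[], object],
                    total_timesteps: int, n_chunks: int = 3):
    """Train `model` in n_chunks segments, building a fresh env for each one.

    `model` is assumed to behave like an SB3 algorithm (e.g. PPO);
    `make_env` is a hypothetical factory returning a new vectorized env.
    """
    for chunk in split_timesteps(total_timesteps, n_chunks):
        env = make_env()  # fresh worker processes / Unity instances
        try:
            model.set_env(env)  # swap the environment on the existing model
            # reset_num_timesteps=False keeps the timestep counter running
            # across segments instead of restarting it at 0.
            model.learn(total_timesteps=chunk, reset_num_timesteps=False)
        finally:
            env.close()  # always shut the old workers down, even on error
    return model
```

Usage would be something like `train_in_chunks(model, make_env, total_timesteps=3_000_000, n_chunks=3)`. Note that PPO collects whole rollouts, so each segment may run slightly past its chunk size.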