I’m having an issue where, no matter what I do, ML-Agents crashes at exactly 500,000 steps. I’ve tried adjusting the YAML config file but haven’t had any luck.
Even though it crashes, it still correctly exports the .nn files at the 500,000-step mark.
mlagents-learn config.yaml --env=build/game --num-envs=6 --no-graphics --run-id=HunterAug1d
max_steps: 1.0e6  # also tried 1e6
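Since I wasn’t sure how the parser treats these spellings, I checked with PyYAML (which, as far as I can tell, is what mlagents-learn uses to load the config). Under YAML 1.1 rules a float needs both a dot and a signed exponent, so both forms I tried actually load as strings, and only the fully signed form loads as a float — though the trainer may well cast the string itself, so this could be a red herring:

```python
import yaml  # PyYAML, the parser mlagents-learn appears to use for configs

# YAML 1.1 (PyYAML's resolver) only tags a scalar as a float when it has
# a dot AND a signed exponent; other scientific-notation spellings stay strings.
for text in ("max_steps: 1e6", "max_steps: 1.0e6", "max_steps: 1.0e+6"):
    value = yaml.safe_load(text)["max_steps"]
    print(f"{text!r:24} -> {value!r} ({type(value).__name__})")
```

On my machine this prints `'1e6'` and `'1.0e6'` as `str`, and only `1.0e+6` as the float `1000000.0`.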
2020-08-02 03:24:53 INFO [stats.py:101] PreyAgent: Step: 50000. Time Elapsed: 84.336 s Mean Reward: -10.210. Std of Reward: 4.147. Training.
2020-08-02 03:24:53 INFO [stats.py:101] HunterAgent: Step: 50000. Time Elapsed: 84.370 s Mean Reward: 10.340. Std of Reward: 4.045. Training.
2020-08-02 03:26:10 INFO [stats.py:101] HunterAgent: Step: 100000. Time Elapsed: 160.819 s Mean Reward: 9.749. Std of Reward: 4.078. Training.
2020-08-02 03:26:10 INFO [stats.py:101] PreyAgent: Step: 100000. Time Elapsed: 160.861 s Mean Reward: -9.663. Std of Reward: 4.145. Training.
2020-08-02 03:27:25 INFO [stats.py:101] HunterAgent: Step: 150000. Time Elapsed: 235.878 s Mean Reward: 8.233. Std of Reward: 4.155. Training.
2020-08-02 03:27:25 INFO [stats.py:101] PreyAgent: Step: 150000. Time Elapsed: 235.914 s Mean Reward: -8.089. Std of Reward: 4.243. Training.
2020-08-02 03:28:40 INFO [stats.py:101] HunterAgent: Step: 200000. Time Elapsed: 311.086 s Mean Reward: 8.099. Std of Reward: 4.021. Training.
2020-08-02 03:28:40 INFO [stats.py:101] PreyAgent: Step: 200000. Time Elapsed: 311.129 s Mean Reward: -7.890. Std of Reward: 4.038. Training.
2020-08-02 03:29:56 INFO [stats.py:101] PreyAgent: Step: 250000. Time Elapsed: 387.289 s Mean Reward: -8.274. Std of Reward: 4.451. Training.
2020-08-02 03:29:56 INFO [stats.py:101] HunterAgent: Step: 250000. Time Elapsed: 387.321 s Mean Reward: 8.553. Std of Reward: 4.436. Training.
2020-08-02 03:31:11 INFO [stats.py:101] PreyAgent: Step: 300000. Time Elapsed: 462.680 s Mean Reward: -7.454. Std of Reward: 4.075. Training.
2020-08-02 03:31:12 INFO [stats.py:101] HunterAgent: Step: 300000. Time Elapsed: 462.714 s Mean Reward: 7.712. Std of Reward: 4.039. Training.
2020-08-02 03:32:26 INFO [stats.py:101] PreyAgent: Step: 350000. Time Elapsed: 537.420 s Mean Reward: -7.014. Std of Reward: 3.716. Training.
2020-08-02 03:32:26 INFO [stats.py:101] HunterAgent: Step: 350000. Time Elapsed: 537.452 s Mean Reward: 7.257. Std of Reward: 3.694. Training.
2020-08-02 03:33:42 INFO [stats.py:101] PreyAgent: Step: 400000. Time Elapsed: 613.628 s Mean Reward: -7.498. Std of Reward: 4.707. Training.
2020-08-02 03:33:42 INFO [stats.py:101] HunterAgent: Step: 400000. Time Elapsed: 613.662 s Mean Reward: 7.670. Std of Reward: 4.682. Training.
2020-08-02 03:34:58 INFO [stats.py:101] PreyAgent: Step: 450000. Time Elapsed: 689.006 s Mean Reward: -8.772. Std of Reward: 5.101. Training.
2020-08-02 03:34:58 INFO [stats.py:101] HunterAgent: Step: 450000. Time Elapsed: 689.041 s Mean Reward: 8.801. Std of Reward: 5.022. Training.
2020-08-02 03:36:14 INFO [stats.py:101] PreyAgent: Step: 500000. Time Elapsed: 765.620 s Mean Reward: -9.187. Std of Reward: 5.112. Training.
2020-08-02 03:36:14 INFO [rl_trainer.py:151] Checkpointing model for PreyAgent.
2020-08-02 03:36:14 INFO [stats.py:101] HunterAgent: Step: 500000. Time Elapsed: 765.670 s Mean Reward: 9.222. Std of Reward: 5.031. Training.
2020-08-02 03:36:14 INFO [rl_trainer.py:151] Checkpointing model for HunterAgent.
2020-08-02 03:36:19 INFO [trainer_controller.py:76] Saved Model
2020-08-02 03:36:19 INFO [model_serialization.py:203] List of nodes to export for brain :PreyAgent
2020-08-02 03:36:19 INFO [model_serialization.py:205] is_continuous_control
2020-08-02 03:36:19 INFO [model_serialization.py:205] trainer_major_version
2020-08-02 03:36:19 INFO [model_serialization.py:205] trainer_minor_version
2020-08-02 03:36:19 INFO [model_serialization.py:205] trainer_patch_version
2020-08-02 03:36:19 INFO [model_serialization.py:205] version_number
2020-08-02 03:36:19 INFO [model_serialization.py:205] memory_size
2020-08-02 03:36:19 INFO [model_serialization.py:205] action_output_shape
2020-08-02 03:36:19 INFO [model_serialization.py:205] action
Converting results\HunterAug1d\PreyAgent/frozen_graph_def.pb to results\HunterAug1d\PreyAgent.nn
GLOBALS: 'is_continuous_control', 'trainer_major_version', 'trainer_minor_version', 'trainer_patch_version', 'version_number', 'memory_size', 'action_output_shape'
IN: 'vector_observation': [-1, 1, 1, 741] => 'policy/main_graph_0/hidden_0/BiasAdd'
IN: 'action_masks': [-1, 1, 1, 6] => 'policy_1/strided_slice'
IN: 'action_masks': [-1, 1, 1, 6] => 'policy_1/strided_slice_1'
OUT: 'action'
DONE: wrote results\HunterAug1d\PreyAgent.nn file.
2020-08-02 03:36:20 INFO [model_serialization.py:83] Exported results\HunterAug1d\PreyAgent.nn file
2020-08-02 03:36:20 INFO [model_serialization.py:203] List of nodes to export for brain :HunterAgent
2020-08-02 03:36:20 INFO [model_serialization.py:205] is_continuous_control
2020-08-02 03:36:20 INFO [model_serialization.py:205] trainer_major_version
2020-08-02 03:36:20 INFO [model_serialization.py:205] trainer_minor_version
2020-08-02 03:36:20 INFO [model_serialization.py:205] trainer_patch_version
2020-08-02 03:36:20 INFO [model_serialization.py:205] version_number
2020-08-02 03:36:20 INFO [model_serialization.py:205] memory_size
2020-08-02 03:36:20 INFO [model_serialization.py:205] action_output_shape
2020-08-02 03:36:20 INFO [model_serialization.py:205] action
Converting results\HunterAug1d\HunterAgent/frozen_graph_def.pb to results\HunterAug1d\HunterAgent.nn
GLOBALS: 'is_continuous_control', 'trainer_major_version', 'trainer_minor_version', 'trainer_patch_version', 'version_number', 'memory_size', 'action_output_shape'
IN: 'vector_observation': [-1, 1, 1, 741] => 'policy/main_graph_0/hidden_0/BiasAdd'
IN: 'action_masks': [-1, 1, 1, 6] => 'policy_1/strided_slice'
IN: 'action_masks': [-1, 1, 1, 6] => 'policy_1/strided_slice_1'
OUT: 'action'
DONE: wrote results\HunterAug1d\HunterAgent.nn file.
2020-08-02 03:36:20 INFO [model_serialization.py:83] Exported results\HunterAug1d\HunterAgent.nn file
2020-08-02 03:36:20 INFO [environment.py:418] Environment shut down with return code 0 (CTRL_C_EVENT).
2020-08-02 03:36:21 INFO [environment.py:418] Environment shut down with return code 0 (CTRL_C_EVENT).
2020-08-02 03:36:21 INFO [environment.py:418] Environment shut down with return code 0 (CTRL_C_EVENT).
2020-08-02 03:36:22 INFO [environment.py:418] Environment shut down with return code 0 (CTRL_C_EVENT).
2020-08-02 03:36:22 INFO [environment.py:418] Environment shut down with return code 0 (CTRL_C_EVENT).
2020-08-02 03:36:22 INFO [environment.py:418] Environment shut down with return code 0 (CTRL_C_EVENT).
2020-08-02 03:36:23 INFO [environment.py:418] Environment shut down with return code 0 (CTRL_C_EVENT).
2020-08-02 03:36:23 INFO [environment.py:418] Environment shut down with return code 0 (CTRL_C_EVENT).
2020-08-02 03:36:24 INFO [environment.py:418] Environment shut down with return code 0 (CTRL_C_EVENT).
Traceback (most recent call last):
File "C:\Program Files\Python38\lib\multiprocessing\queues.py", line 241, in _feed
send_bytes(obj)
File "C:\Program Files\Python38\lib\multiprocessing\connection.py", line 200, in send_bytes
self._send_bytes(m[offset:offset + size])
File "C:\Program Files\Python38\lib\multiprocessing\connection.py", line 290, in _send_bytes
nwritten, err = ov.GetOverlappedResult(True)
BrokenPipeError: [WinError 109] The pipe has been ended
2020-08-02 03:36:24 INFO [environment.py:418] Environment shut down with return code 0 (CTRL_C_EVENT).
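For what it’s worth, I don’t think the traceback itself is the root cause: all the env workers report a clean shutdown (return code 0) first, so the BrokenPipeError looks like the multiprocessing queue’s feeder thread writing to a worker whose pipe had already closed. That error is easy to reproduce in isolation (a minimal sketch, not ML-Agents code):

```python
import multiprocessing as mp

def broken_pipe_demo() -> str:
    """Write into a pipe whose read end is already closed."""
    reader, writer = mp.Pipe(duplex=False)
    reader.close()                  # the "worker" end is gone
    try:
        writer.send(b"payload")     # what the queue's feeder thread does
    except BrokenPipeError as exc:
        return type(exc).__name__
    finally:
        writer.close()
    return "no error"

print(broken_pipe_demo())  # prints "BrokenPipeError"
```

So the real question is why training stops at 500,000 steps in the first place, not the traceback printed during teardown.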
ML-Agents package: Release 4
OS: Windows 10
Python: 3.8.5
mlagents 0.18.0
mlagents-envs 0.18.0
tensorboard 2.3.0
tensorboard-plugin-wit 1.7.0
tensorflow 2.3.0
tensorflow-estimator 2.3.0
numpy 1.18.5