invalid API version number and recv failed

Hi, first of all, sorry for my poor English.
I tried to follow the Karting Microgame tutorial to learn how ML-Agents works. I first tried TensorFlow on the CPU, but my Intel processor doesn't support AVX/AVX2, so I followed a tutorial to install the GPU build of TensorFlow.

I use Anaconda with a virtual environment: Python 3.7.9, tensorflow-gpu 2.1.0, and ml-agents 0.19.0.

Steps I did in Unity: I downloaded the Karting Microgame project and opened the ML-Agents training scene.

Steps I did in Python, with Anaconda3:

-> create a virtual environment (Python 3.7.9)
-> install tensorflow-gpu in that environment, after installing cudatoolkit and cudnn
-> install mlagents from the latest release branch (with pip install -e . in each directory, ml-agents-envs and ml-agents)
-> run the mlagents-learn command from the Karting Microgame folder

I ran the Python command first and then pressed Play. I got the error below.
Thanks in advance for your help.

2020-09-17 15:22:32.529850: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
WARNING:tensorflow:From C:\Users\-\.conda\envs\chameleon\lib\site-packages\tensorflow_core\python\compat\v2_compat.py:88: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term


                        ▄▄▄▓▓▓▓
                   ╓▓▓▓▓▓▓█▓▓▓▓▓
              ,▄▄▄m▀▀▀'  ,▓▓▓▀▓▓▄                           ▓▓▓  ▓▓▌
            ▄▓▓▓▀'      ▄▓▓▀  ▓▓▓      ▄▄     ▄▄ ,▄▄ ▄▄▄▄   ,▄▄ ▄▓▓▌▄ ▄▄▄    ,▄▄
          ▄▓▓▓▀        ▄▓▓▀   ▐▓▓▌     ▓▓▌   ▐▓▓ ▐▓▓▓▀▀▀▓▓▌ ▓▓▓ ▀▓▓▌▀ ^▓▓▌  ╒▓▓▌
        ▄▓▓▓▓▓▄▄▄▄▄▄▄▄▓▓▓      ▓▀      ▓▓▌   ▐▓▓ ▐▓▓    ▓▓▓ ▓▓▓  ▓▓▌   ▐▓▓▄ ▓▓▌
        ▀▓▓▓▓▀▀▀▀▀▀▀▀▀▀▓▓▄     ▓▓      ▓▓▌   ▐▓▓ ▐▓▓    ▓▓▓ ▓▓▓  ▓▓▌    ▐▓▓▐▓▓
          ^█▓▓▓        ▀▓▓▄   ▐▓▓▌     ▓▓▓▓▄▓▓▓▓ ▐▓▓    ▓▓▓ ▓▓▓  ▓▓▓▄    ▓▓▓▓`
            '▀▓▓▓▄      ^▓▓▓  ▓▓▓       └▀▀▀▀ ▀▀ ^▀▀    `▀▀ `▀▀   '▀▀    ▐▓▓▌
               ▀▀▀▀▓▄▄▄   ▓▓▓▓▓▓,                                      ▓▓▓▓▀
                   `▀█▓▓▓▓▓▓▓▓▓▌
                        ¬`▀▀▀█▓


Version information:
  ml-agents: 0.19.0,
  ml-agents-envs: 0.19.0,
  Communicator API: 1.0.0,
  TensorFlow: 2.1.0
2020-09-17 15:22:38 WARNING [learn.py:256] The --train option has been deprecated. Train mode is now the default. Use --inference to run in inference mode.
2020-09-17 15:22:38 INFO [learn.py:271] run_seed set to 2703
2020-09-17 15:22:39.414271: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
WARNING:tensorflow:From C:\Users\-\.conda\envs\chameleon\lib\site-packages\tensorflow_core\python\compat\v2_compat.py:88: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
2020-09-17 15:22:41 INFO [environment.py:199] Listening on port 5004. Start training by pressing the Play button in the Unity Editor.
Process Process-1:
Traceback (most recent call last):
  File "C:\Users\-\.conda\envs\chameleon\lib\multiprocessing\process.py", line 297, in _bootstrap
    self.run()
  File "C:\Users\-\.conda\envs\chameleon\lib\multiprocessing\process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "h:\tutos\ml-agents-latest_release\ml-agents\mlagents\trainers\subprocess_env_manager.py", line 139, in worker
    worker_id, [env_parameters, engine_configuration_channel, stats_channel]
  File "h:\tutos\ml-agents-latest_release\ml-agents\mlagents\trainers\learn.py", line 207, in create_unity_environment
    log_folder=log_folder,
  File "h:\tutos\ml-agents-latest_release\ml-agents-envs\mlagents_envs\environment.py", line 220, in __init__
    aca_params.package_version,
  File "h:\tutos\ml-agents-latest_release\ml-agents-envs\mlagents_envs\environment.py", line 85, in _check_communication_compatibility
    unity_communicator_version = StrictVersion(unity_com_ver)
  File "C:\Users\-\.conda\envs\chameleon\lib\distutils\version.py", line 40, in __init__
    self.parse(vstring)
  File "C:\Users\-\.conda\envs\chameleon\lib\distutils\version.py", line 137, in parse
    raise ValueError("invalid version number '%s'" % vstring)
ValueError: invalid version number 'API-13'
2020-09-17 15:22:57 INFO [trainer_controller.py:192] Learning was interrupted. Please wait while the graph is generated.
2020-09-17 15:22:57 INFO [trainer_controller.py:76] Saved Model
Traceback (most recent call last):
  File "C:\Users\-\.conda\envs\chameleon\lib\multiprocessing\connection.py", line 312, in _recv_bytes
    nread, err = ov.GetOverlappedResult(True)
BrokenPipeError: [WinError 109] The pipe has been ended

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "h:\tutos\ml-agents-latest_release\ml-agents\mlagents\trainers\subprocess_env_manager.py", line 88, in recv
    response: EnvironmentResponse = self.conn.recv()
  File "C:\Users\-\.conda\envs\chameleon\lib\multiprocessing\connection.py", line 250, in recv
    buf = self._recv_bytes()
  File "C:\Users\-\.conda\envs\chameleon\lib\multiprocessing\connection.py", line 321, in _recv_bytes
    raise EOFError
EOFError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\-\.conda\envs\chameleon\Scripts\mlagents-learn-script.py", line 33, in <module>
    sys.exit(load_entry_point('mlagents', 'console_scripts', 'mlagents-learn')())
  File "h:\tutos\ml-agents-latest_release\ml-agents\mlagents\trainers\learn.py", line 276, in main
    run_cli(parse_command_line())
  File "h:\tutos\ml-agents-latest_release\ml-agents\mlagents\trainers\learn.py", line 272, in run_cli
    run_training(run_seed, options)
  File "h:\tutos\ml-agents-latest_release\ml-agents\mlagents\trainers\learn.py", line 149, in run_training
    tc.start_learning(env_manager)
  File "h:\tutos\ml-agents-latest_release\ml-agents-envs\mlagents_envs\timers.py", line 305, in wrapped
    return func(*args, **kwargs)
  File "h:\tutos\ml-agents-latest_release\ml-agents\mlagents\trainers\trainer_controller.py", line 201, in start_learning
    raise ex
  File "h:\tutos\ml-agents-latest_release\ml-agents\mlagents\trainers\trainer_controller.py", line 177, in start_learning
    self._reset_env(env_manager)
  File "h:\tutos\ml-agents-latest_release\ml-agents-envs\mlagents_envs\timers.py", line 305, in wrapped
    return func(*args, **kwargs)
  File "h:\tutos\ml-agents-latest_release\ml-agents\mlagents\trainers\trainer_controller.py", line 113, in _reset_env
    env_manager.reset(config=new_config)
  File "h:\tutos\ml-agents-latest_release\ml-agents\mlagents\trainers\env_manager.py", line 66, in reset
    self.first_step_infos = self._reset_env(config)
  File "h:\tutos\ml-agents-latest_release\ml-agents\mlagents\trainers\subprocess_env_manager.py", line 290, in _reset_env
    ew.previous_step = EnvironmentStep(ew.recv().payload, ew.worker_id, {}, {})
  File "h:\tutos\ml-agents-latest_release\ml-agents\mlagents\trainers\subprocess_env_manager.py", line 94, in recv
    raise UnityCommunicationException("UnityEnvironment worker: recv failed.")
mlagents_envs.exception.UnityCommunicationException: UnityEnvironment worker: recv failed.

Hi again,
I tested the ML-Agents examples (the 3DBall test) in a new conda virtual environment with ml-agents 0.19.0, ml-agents-envs 0.19.0, tensorflow 2.1.0 and tensorflow-gpu 2.3.0, and IT WORKED!!!
I could run a training session, get the .nn file, and then run inference.

But when I went back to the Karting project with the same environment, I got the same error:

 Version information:
  ml-agents: 0.19.0,
  ml-agents-envs: 0.19.0,
  Communicator API: 1.0.0,
  TensorFlow: 2.1.0
2020-09-18 15:23:11 INFO [learn.py:271] run_seed set to 3872
2020-09-18 15:23:11.999721: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
WARNING:tensorflow:From c:\users\-\.conda\envs\mlagent_training_env\lib\site-packages\tensorflow_core\python\compat\v2_compat.py:88: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
2020-09-18 15:23:14 INFO [environment.py:199] Listening on port 5004. Start training by pressing the Play button in the Unity Editor.
Process Process-1:
Traceback (most recent call last):
  File "c:\users\-\.conda\envs\mlagent_training_env\lib\multiprocessing\process.py", line 297, in _bootstrap
    self.run()
  File "c:\users\-\.conda\envs\mlagent_training_env\lib\multiprocessing\process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "c:\users\-\.conda\envs\mlagent_training_env\lib\site-packages\mlagents\trainers\subprocess_env_manager.py", line 139, in worker
    worker_id, [env_parameters, engine_configuration_channel, stats_channel]
  File "c:\users\-\.conda\envs\mlagent_training_env\lib\site-packages\mlagents\trainers\learn.py", line 207, in create_unity_environment
    log_folder=log_folder,
  File "c:\users\-\.conda\envs\mlagent_training_env\lib\site-packages\mlagents_envs\environment.py", line 220, in __init__
    aca_params.package_version,
  File "c:\users\-\.conda\envs\mlagent_training_env\lib\site-packages\mlagents_envs\environment.py", line 85, in _check_communication_compatibility
    unity_communicator_version = StrictVersion(unity_com_ver)
  File "c:\users\-\.conda\envs\mlagent_training_env\lib\distutils\version.py", line 40, in __init__
    self.parse(vstring)
  File "c:\users\-\.conda\envs\mlagent_training_env\lib\distutils\version.py", line 137, in parse
    raise ValueError("invalid version number '%s'" % vstring)
ValueError: invalid version number 'API-13'
2020-09-18 15:23:25 INFO [trainer_controller.py:192] Learning was interrupted. Please wait while the graph is generated.
2020-09-18 15:23:25 INFO [trainer_controller.py:76] Saved Model
Traceback (most recent call last):
  File "c:\users\-\.conda\envs\mlagent_training_env\lib\multiprocessing\connection.py", line 312, in _recv_bytes
    nread, err = ov.GetOverlappedResult(True)
BrokenPipeError: [WinError 109] The pipe has been ended

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "c:\users\-\.conda\envs\mlagent_training_env\lib\site-packages\mlagents\trainers\subprocess_env_manager.py", line 88, in recv
    response: EnvironmentResponse = self.conn.recv()
  File "c:\users\-\.conda\envs\mlagent_training_env\lib\multiprocessing\connection.py", line 250, in recv
    buf = self._recv_bytes()
  File "c:\users\-\.conda\envs\mlagent_training_env\lib\multiprocessing\connection.py", line 321, in _recv_bytes
    raise EOFError
EOFError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "c:\users\-\.conda\envs\mlagent_training_env\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "c:\users\-\.conda\envs\mlagent_training_env\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\-\.conda\envs\MLAgent_training_env\Scripts\mlagents-learn.exe\__main__.py", line 7, in <module>
  File "c:\users\-\.conda\envs\mlagent_training_env\lib\site-packages\mlagents\trainers\learn.py", line 276, in main
    run_cli(parse_command_line())
  File "c:\users\-\.conda\envs\mlagent_training_env\lib\site-packages\mlagents\trainers\learn.py", line 272, in run_cli
    run_training(run_seed, options)
  File "c:\users\-\.conda\envs\mlagent_training_env\lib\site-packages\mlagents\trainers\learn.py", line 149, in run_training
    tc.start_learning(env_manager)
  File "c:\users\-\.conda\envs\mlagent_training_env\lib\site-packages\mlagents_envs\timers.py", line 305, in wrapped
    return func(*args, **kwargs)
  File "c:\users\-\.conda\envs\mlagent_training_env\lib\site-packages\mlagents\trainers\trainer_controller.py", line 201, in start_learning
    raise ex
  File "c:\users\-\.conda\envs\mlagent_training_env\lib\site-packages\mlagents\trainers\trainer_controller.py", line 177, in start_learning
    self._reset_env(env_manager)
  File "c:\users\-\.conda\envs\mlagent_training_env\lib\site-packages\mlagents_envs\timers.py", line 305, in wrapped
    return func(*args, **kwargs)
  File "c:\users\-\.conda\envs\mlagent_training_env\lib\site-packages\mlagents\trainers\trainer_controller.py", line 113, in _reset_env
    env_manager.reset(config=new_config)
  File "c:\users\-\.conda\envs\mlagent_training_env\lib\site-packages\mlagents\trainers\env_manager.py", line 66, in reset
    self.first_step_infos = self._reset_env(config)
  File "c:\users\-\.conda\envs\mlagent_training_env\lib\site-packages\mlagents\trainers\subprocess_env_manager.py", line 290, in _reset_env
    ew.previous_step = EnvironmentStep(ew.recv().payload, ew.worker_id, {}, {})
  File "c:\users\-\.conda\envs\mlagent_training_env\lib\site-packages\mlagents\trainers\subprocess_env_manager.py", line 94, in recv
    raise UnityCommunicationException("UnityEnvironment worker: recv failed.")
mlagents_envs.exception.UnityCommunicationException: UnityEnvironment worker: recv failed.

I saw in several threads that "recv failed" could be caused by an http_proxy setting, but I don't use any proxy. Do I have to check something inside Unity?
Why would I get an invalid version number with the Karting project and not with the 3DBall project? Do I have to downgrade ML-Agents in the Unity Package Manager?
Sorry, I'm a total beginner, and sorry for any grammatical mistakes ;) !
And of course, thanks in advance for your answers.
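
For anyone who wants to rule out the proxy theory quickly, here is a minimal sketch that lists the proxy-related environment variables visible to Python. The names `find_proxy_settings` and `PROXY_VARS` are made up for this example and are not part of ML-Agents. Run it from the same conda prompt used for training, since that is the environment the trainer inherits.

```python
import os

# Environment variables commonly consulted for proxy settings.
PROXY_VARS = ("http_proxy", "https_proxy", "no_proxy")


def find_proxy_settings(environ=None):
    """Return the proxy-related variables that are set, as {name: value}."""
    environ = os.environ if environ is None else environ
    found = {}
    for name, value in environ.items():
        # Windows variable names are case-insensitive, so compare in lowercase.
        if name.lower() in PROXY_VARS and value:
            found[name] = value
    return found


if __name__ == "__main__":
    settings = find_proxy_settings()
    if settings:
        for name, value in settings.items():
            print(f"{name} = {value}")
    else:
        print("No proxy variables are set in this shell.")
```

If nothing is listed, the proxy explanation probably does not apply, and the `ValueError` about the version number is the more likely cause.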

I also got this error in the Unity console:

Couldn't connect to trainer on port 5004 using API version API-13. Will perform inference instead.
UnityEngine.Debug:Log(Object)
MLAgents.Academy:InitializeEnvironment() (at Assets/Karting/ML_Agents/Scripts/Academy.cs:228)
MLAgents.Academy:LazyInitialization() (at Assets/Karting/ML_Agents/Scripts/Academy.cs:147)
MLAgents.Agent:OnEnable() (at Assets/Karting/ML_Agents/Scripts/Agent.cs:255)

I guess it's because the Python API is 14 and the Unity ML-Agents is 13, but if I upgrade ML-Agents in Unity it doesn't work.
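
The traceback shows where the two sides disagree: the Python trainer passes the version string reported by Unity to `StrictVersion`, which only accepts dotted numbers, while the Karting project's scripts report `API-13`. The sketch below re-creates that check with a plain regular expression so it runs without `distutils`. The pattern and the helper `parse_strict` are an illustration written for this thread, not the ML-Agents source.

```python
import re

# Same shape as the pattern used by distutils' StrictVersion: "N.N" or "N.N.N",
# optionally followed by an alpha/beta tag such as "a1" or "b2".
STRICT_VERSION = re.compile(r"^(\d+)\.(\d+)(\.(\d+))?([ab](\d+))?$")


def parse_strict(vstring):
    """Return (major, minor, patch), or raise ValueError for any other format."""
    match = STRICT_VERSION.match(vstring)
    if match is None:
        raise ValueError("invalid version number '%s'" % vstring)
    major, minor, _, patch, _, _ = match.groups()
    return int(major), int(minor), int(patch or 0)


# "1.0.0" is what the 0.19.0 trainer prints as its Communicator API;
# "API-13" is what the Karting project reports in the Unity console.
for candidate in ("1.0.0", "API-13"):
    try:
        print(candidate, "->", parse_strict(candidate))
    except ValueError as err:
        print(candidate, "->", err)
# 1.0.0 -> (1, 0, 0)
# API-13 -> invalid version number 'API-13'
```

This would explain why 3DBall works with the same Python environment while the Karting project does not: the worker process fails while parsing the version string, and the main process then reports "recv failed" because the worker is gone.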

I'm having similar issues: both the API-13 and the recv failed errors. I'm on TF 2.3, Python 3.7.4 and Unity 2019.4, with the same ML-Agents version. Like you, training works fine in the ML-Agents examples (3D Ball, Roller Ball), but not in the Microgame. The main difference is that the kart game doesn't use the ML-Agents package and its dependencies. Playing around with that breaks a whole lot of stuff.


OK, problem solved. After a couple of days trying to figure it out, I fixed it as follows.
If you are using tensorflow-gpu and hit the same problem, do the following, in this order:

- create a new environment with Python 3.7.9
- install mlagents 0.13.1 using the development (editable) install, that is:
  - download or clone the repo from the GitHub release (be careful to take version 0.13, because the latest release doesn't match the API)
  - in your conda prompt, go to the folder of the downloaded release (cd + the path to your folder)
  - type "cd .\ml-agents-envs", then type "pip install -e ."
  - type "cd .."
  - type "cd .\ml-agents", then type "pip install -e ." again

- install tensorflow-gpu with the command "conda install -c conda-forge tensorflow-gpu"

- type "conda remove tensorflow"
- type "conda install tensorflow-estimator=2.1.0"
- type "conda install tensorflow-gpu=2.1.0"

I did all these steps in that order, and now I can train my karts and run inference as explained in the tutorial. Still, I regret that the tutorial hasn't been updated for the latest release, and that good explanations for training with GPU TensorFlow are hard to find (even more so in French, which is my native language).
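
After an install sequence like the one above, it can help to confirm what actually ended up in the environment before pressing Play. The following is a minimal sketch, not an official ML-Agents tool: `installed_versions` is a hypothetical helper, and the distribution names in the list are assumptions. Packages installed through conda may be registered under a different name than on pip (for example `tensorflow` rather than `tensorflow-gpu`), so adjust the list to your setup.

```python
try:
    from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
except ImportError:
    # Assumption: on Python 3.7 the importlib_metadata backport is installed.
    from importlib_metadata import PackageNotFoundError, version


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    result = {}
    for name in names:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


if __name__ == "__main__":
    # Assumed distribution names; edit to match your environment.
    wanted = ["mlagents", "mlagents-envs", "tensorflow", "tensorflow-gpu",
              "tensorflow-estimator"]
    for name, found in installed_versions(wanted).items():
        print(f"{name}: {found or 'not installed'}")
```

For the Karting project as described in this thread, the entries to look at are the two mlagents packages (0.13.x) and the TensorFlow packages (2.1.0).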

By the way, thank you for all the courses
