Cannot make inference on neural network trained using ML-Agents

Hi,

I trained a neural network using ML-Agents. Training ran without any problems on the PyTorch side, but running inference on the trained network with Barracuda gives me a whole list of errors.

My RL agent uses 2 Buffer Sensors (one to encode teammate info, one to encode room feature info), 1 Grid Sensor with resnet (to encode spatial info about the agent's neighborhood), and a vector observation. The agent setup is in the attached file (“Agent Setup.jpg”).

Upon adding the .onnx file created by ML-Agents into the Editor, I see the error “Cannot reshape array of size 453152 into shape with multiple of 15232 elements” at Unity.Barracuda.TensorExtensions.Reshape. The full error stack is in “Error Stack_on onnx file added.jpg”. Inspecting the onnx file, I see the warning “model detected as NCHW, but not natively in this layout, behaviour might be erroneous”.

When trying to run inference, I see another error: “Off-axis dimensions must match”. The full error stack is in “Error Stack_on inference.jpg”.

When using only Buffer Sensors or only the Grid Sensor, inference has no problem. It is only when both are used together that Barracuda seems to fail. I have uploaded my trained model in “NNModel Onnx File.zip”.

I am using:
ML-Agents Release 18 (uses Barracuda 2.0.0)
PyTorch 1.7.1
Unity Version 2019.4.1f1
Windows 10 OS

Urgently need experts’ help on this.

Thank you.

7514669–926810–NNModel Onnx File.zip (1.69 MB)

Hi Cold85,

Some good news about Agent3_2_472_r18_resnet_and_attn.onnx:

  • On Barracuda 2.0.0 and up, the NN imports without any problem according to my test.
  • On ML-Agents 2.0.0 (and thus Barracuda 2.0.0), it also imports without any problem according to my test.
  • On ML-Agents 1.8.0 (and thus Barracuda 1.3.1), the import fails as you describe above.
  • On bleeding-edge Barracuda, inference matches the reference ONNX Runtime (apart from the RandomNormalLike node, more on this below). I expect this to have been true since Barracuda 2.0.0.

My guess is that you are using ML-Agents 1.8.0 / Barracuda 1.3.1, and that the import bug was fixed in Barracuda 2.0.0 (itself used by ML-Agents 2.0.0). Does that make sense, and is it possible for you to give it a try with ML-Agents 2.0.0?

As a side note: ML-Agents 2.0.0 is a verified release, while ML-Agents 1.8.0 is a preview package.

Final note: Barracuda can’t match RandomNormalLike for two reasons: the seed is not defined by the model and is left to the implementation, and the actual implementation of the random distribution is not standardized and is left to the inference library. However, replacing RandomNormalLike with Identity made inference match.
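For anyone who wants to repeat that comparison, here is a minimal Python sketch of the RandomNormalLike-to-Identity swap. It is not the exact script used for the test above. The function names `replace_random_normal_like` and `rewrite_file` are made up for illustration, and it assumes the `onnx` Python package is installed. It only visits top-level graph nodes (not subgraphs inside control-flow ops). The rewritten model no longer adds noise, so it is only meant for checking outputs against a reference runtime.

```python
def replace_random_normal_like(graph):
    """Turn every RandomNormalLike node in `graph` into an Identity node, in place.

    `graph` is expected to look like an ONNX GraphProto (a `node` sequence whose
    items have `op_type` and `attribute`). Returns the number of nodes rewritten.
    """
    replaced = 0
    for node in graph.node:
        if node.op_type == "RandomNormalLike":
            node.op_type = "Identity"
            # Identity takes no attributes, so drop dtype/mean/scale/seed.
            del node.attribute[:]
            replaced += 1
    return replaced


def rewrite_file(src_path, dst_path):
    """Load an ONNX file, apply the replacement, validate it, and save a copy."""
    import onnx  # imported lazily so the helper above has no dependencies

    model = onnx.load(src_path)
    count = replace_random_normal_like(model.graph)
    onnx.checker.check_model(model)
    onnx.save(model, dst_path)
    return count
```

Calling `rewrite_file` with the path of the original .onnx file and a new output path (both chosen by you) writes a deterministic copy and leaves the original model untouched.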

Hope it helps!

Florent

@fguinier big thanks for looking into my problem!

First, I should point out that ML-Agents 2.0.0’s dependency is actually Barracuda 1.4.0-preview (see the GitHub release page: Releases · Unity-Technologies/ml-agents · GitHub).

I was using ML-Agents 2.1.0-exp.1 / Barracuda 2.0.0-pre.3 when I encountered the errors above. See the versions I screen-captured from my Package Manager:

I also posted this question in the ML-Agents forum (Cannot make inference on neural network trained using ML-Agents), where @WaxyMcRivers replied that Barracuda 2.1.0-preview seems to solve the import errors.

I updated my project’s Barracuda to 2.1.0-preview and got the same results as @WaxyMcRivers. So on my machine, at least, it was Barracuda 2.1.0-preview that resolved the errors.

Hi @Cold85 ,

Thanks for the info and followup!

According to the changelog (Changelog | ML Agents | 2.1.0-exp.1) and a local test in the Package Manager, it seems that we have:

    ML-Agent 2.0.0 → Barracuda 2.0.0 → model import fine
    ML-Agent 2.1.0-exp.1 → Barracuda 2.0.0-pre.3 → error on import
    ML-Agent 2.0.0-exp.1 → Barracuda 1.4.0-preview → error on import
    ML-Agent 2.0.0-pre.3 → Barracuda 2.0.0-pre.3 → error on import
    Also, as you said: Barracuda 2.1.0-preview → model import fine

So it seems that Barracuda 2.0.0 is the minimum version with the fix, and that 2.1.0-preview has it as well. This matches the behavior you are seeing, and it is good news, as it means both the official and the latest versions contain the fix.
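As a rough way to apply that conclusion to a project, the sketch below checks a Barracuda version string against the first version seen to import the model in this thread. The 2.0.0 threshold is inferred from the test results listed above, not taken from a changelog. The names `FIRST_FIXED`, `parse_version`, and `has_import_fix` are hypothetical.

```python
import re

# Earliest Barracuda version observed to import the model in this thread.
# Assumption: inferred from the tests reported above, not from a changelog.
FIRST_FIXED = (2, 0, 0)


def parse_version(text):
    """Split 'X.Y.Z' or 'X.Y.Z-tag' into ((X, Y, Z), tag_or_None)."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)(?:-(.+))?", text.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    major, minor, patch, tag = match.groups()
    return (int(major), int(minor), int(patch)), tag


def has_import_fix(barracuda_version):
    """Return True if the version is at or after the first fixed release."""
    core, tag = parse_version(barracuda_version)
    if core != FIRST_FIXED:
        return core > FIRST_FIXED
    # Same X.Y.Z as the first fixed release: pre-release tags sort before it.
    return tag is None


for version in ("1.3.1", "1.4.0-preview", "2.0.0-pre.3", "2.0.0", "2.1.0-preview"):
    print(version, has_import_fix(version))
```

The pre-release handling matters here: 2.0.0-pre.3 shares its X.Y.Z with the fixed release but failed on import, so a plain numeric comparison would give the wrong answer for it.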

However, the documentation about the dependencies between ML-Agents and Barracuda does indeed seem to be wrong (as you pointed out: Releases · Unity-Technologies/ml-agents · GitHub). I will raise this with the ML-Agents team.

Thanks again for the feedback!
Florent

Hi,

To add the case that I tested:

ML-Agent 2.1.0-exp.1 → Barracuda 2.1.0-preview → model import fine, model inferred fine in game mode