Can I open a model trained with ML-Agents in PyTorch?

Hi, I want to train the standard PPO model, but I then need to run additional tests in PyTorch. From the source code, the framework seems to make extensive use of PyTorch. Along with the Barracuda model, it also saves some *.pt files, which I guess are PyTorch models. So I am inclined to think that it is indeed possible to re-import such models into Python.

However, I am not sure where to find the definition of the architecture in order to load the state_dict into it. Any help? Thanks!

Hey,

I was trying to accomplish the same thing. Based on my testing, it is not possible to load the model using just the .pt files; you need to use the ONNX files. If you specifically want to import it into PyTorch, that is possible using the Caffe2 ONNX backend. However, the easiest way for me was to use the onnxruntime library to load the model and run inference.
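The onnxruntime route mentioned above might look roughly like the sketch below. It is only an illustration: the function names are made up for this example, the inputs are assumed to be float32 tensors, and the input and output names are read from the file rather than assumed.

```python
from typing import Dict, List, Sequence, Union

Dim = Union[int, str, None]


def resolve_shape(shape: Sequence[Dim], batch_size: int = 1) -> List[int]:
    """Replace symbolic or unknown dimensions (e.g. 'batch', None) with batch_size."""
    return [d if isinstance(d, int) and d > 0 else batch_size for d in shape]


def run_onnx_with_dummy_inputs(model_path: str, batch_size: int = 1) -> Dict[str, object]:
    """Load an ONNX file and run one forward pass on all-zero inputs."""
    # Imported lazily so resolve_shape can be used without these packages.
    import numpy as np
    import onnxruntime as ort

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])

    # Assumption: every input is a float32 tensor; check inp.type if this fails.
    feed = {
        inp.name: np.zeros(resolve_shape(inp.shape, batch_size), dtype=np.float32)
        for inp in session.get_inputs()
    }

    # Ask for every output by name so the result can be returned as a dict.
    output_names = [out.name for out in session.get_outputs()]
    outputs = session.run(output_names, feed)
    return dict(zip(output_names, outputs))
```

Calling run_onnx_with_dummy_inputs on the exported .onnx file and printing the keys of the result shows which outputs the model exposes; real observations would then replace the zero arrays.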

This isn’t a supported use case currently, as the architecture is determined by the observations/actions you specify in Unity. Netron (https://netron.app/) is a great way to inspect the ONNX file if you’re interested in seeing what the network looks like.
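For a quick text summary alongside Netron, the same file can also be inspected with the onnx Python package. This is a minimal sketch; summarize_onnx and count_ops are names invented for this example.

```python
from collections import Counter
from typing import Dict, Iterable


def count_ops(op_types: Iterable[str]) -> Dict[str, int]:
    """Count how many times each operator type appears."""
    return dict(Counter(op_types))


def summarize_onnx(model_path: str) -> Dict[str, object]:
    """Return input names, output names, operator counts and the number of stored tensors."""
    # Imported lazily so count_ops can be used without the onnx package.
    import onnx

    graph = onnx.load(model_path).graph

    # Some exporters also list weights among the graph inputs; filter them out.
    initializer_names = {init.name for init in graph.initializer}
    return {
        "inputs": [i.name for i in graph.input if i.name not in initializer_names],
        "outputs": [o.name for o in graph.output],
        "ops": count_ops(node.op_type for node in graph.node),
        "num_weight_tensors": len(graph.initializer),
    }
```

The initializers are the tensors stored inside the file, so a non-zero num_weight_tensors indicates that the file carries parameters in addition to the graph structure.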

Hi @ervteng_unity and @hmusr09, thank you for your replies. I am pretty surprised by them, because I think I have actually managed to do it.
I noticed that to create the model structure I needed the behavior spec and the network settings, which, as far as I understand, are passed to Python from the Unity scene through one of the communication channels.
So I started training (in debug mode through /learn.py) and saved both the behavior spec and the network settings to pickled files.
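The saving step can be sketched as a small helper like the one below. The helper names and the tag argument are made up for this example, and where exactly to call save_specs inside the trainer code depends on the ML-Agents version, so that part is left open. The only requirement is that both objects are in scope at that point.

```python
import pickle
from pathlib import Path
from typing import Any, Tuple


def save_specs(behavior_spec: Any, network_settings: Any, out_dir: str, tag: str) -> Tuple[Path, Path]:
    """Pickle the two objects needed to rebuild the network later."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    spec_path = out / f"beha_spec_{tag}.sp"
    net_path = out / f"net_set_{tag}.sp"
    with open(spec_path, "wb") as f:
        pickle.dump(behavior_spec, f)
    with open(net_path, "wb") as f:
        pickle.dump(network_settings, f)
    return spec_path, net_path


def load_specs(spec_path: Path, net_path: Path) -> Tuple[Any, Any]:
    """Load the pickled objects back; only unpickle files you created yourself."""
    with open(spec_path, "rb") as f:
        behavior_spec = pickle.load(f)
    with open(net_path, "rb") as f:
        network_settings = pickle.load(f)
    return behavior_spec, network_settings
```

Pickle stores references to the classes, not their code, so the environment that loads these files needs the same mlagents packages installed.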
Then when I need to re-open the model outside Unity I do:

import pickle

import torch
from mlagents.trainers.mltorch.networks import SimpleActor

# Specs pickled earlier during a training run started from learn.py
with open('./beha_spec_visual_food.sp', 'rb') as f:
    behavior_spec = pickle.load(f)
with open('./net_set_visual_food.sp', 'rb') as f:
    network_settings = pickle.load(f)

condition_sigma_on_obs = False
tanh_squash = False

# Rebuild the policy network with the same architecture used in training
actor = SimpleActor(
    observation_specs=behavior_spec.observation_specs,
    network_settings=network_settings,
    action_spec=behavior_spec.action_spec,
    conditional_sigma=condition_sigma_on_obs,
    tanh_squash=tanh_squash,
)

# The checkpoint is a dict; the policy weights are stored under 'Policy'
checkpoint = torch.load('./unity_projects/results/prova/VisualFoodCollector/VisualFoodCollector-2400.pt')
actor.load_state_dict(checkpoint['Policy'])

Which returns a promising “<All keys matched successfully>”.
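If the rebuilt network does not match the checkpoint, load_state_dict raises a RuntimeError in its default strict mode. To see the mismatch before loading, the parameter names can be compared first. This is a small sketch with a made-up helper name:

```python
from typing import Dict, Iterable, List


def diff_state_dict_keys(model_keys: Iterable[str], checkpoint_keys: Iterable[str]) -> Dict[str, List[str]]:
    """Report parameter names present on one side but not on the other."""
    model_set, ckpt_set = set(model_keys), set(checkpoint_keys)
    return {
        # Parameters the model expects but the checkpoint does not provide
        "missing": sorted(model_set - ckpt_set),
        # Parameters in the checkpoint that the model has no slot for
        "unexpected": sorted(ckpt_set - model_set),
    }
```

With the objects from the snippet above, this would be called as diff_state_dict_keys(actor.state_dict().keys(), torch.load(path)['Policy'].keys()), where path is the .pt checkpoint file. Two empty lists mean the names line up, although the tensor shapes can still differ.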

Is there anything wrong with this approach?

Nope, this should work fine!

@ValBis I’m trying to load the model in PyTorch as well, but I don’t quite understand how to save the behavior and network information in pickle files. In which file, and at what point, should I save this information?

I would really like a short write-up or tutorial for this as well. I am trying to load a trained model for deployment in a real-world scenario. Does the .onnx file store the network structure as well as the weights and biases? Can I just load the .onnx model like a TorchScript model for deployment?