I am implementing my own custom episode runner through the Python API. Previously, I communicated gameObject positions and velocities from Unity to Python by adding them as observations in CollectObservations in Unity. In other words, I add two vectors to the observations. This is not ideal because the structure of these two vectors gets lost when using env.get_steps() in Python. For example, if a gameObject’s position is the vector <1, 0, 1> and its velocity is <5, 4, 3>, the observations are reduced to the list [1, 0, 1, 5, 4, 3], essentially concatenating the two vectors. To make use of them, I therefore need to know the dimensions of the two individual observations and the order in which they were added. This is very limiting and prone to error.
To get around this, my initial idea was to create a custom side channel that encodes every gameObject state as a string and then parse that in Python. But the issue then is: how will I know which gameObject instance in Unity corresponds to which behavior name in the Python code? Is there a unique identifier that I can use to match a gameObject in Unity to a behavior name in Python?
Hi,
You are right, we usually just use the dimensions of the individual observations and the order in which they were added. I am curious what your use case is and why you would need to decompose the observation back into “position” and “velocity”. There is an id for each agent episode that we use, but it is not a public property, so you will not be able to access it without modifying the source code or using reflection.
You could set up the custom side channel and send, in addition to the observation data, a unique id, and also send that id as the only observation of the Agent (this way you would be able to link a custom side channel message to an agent).
Honestly, that seems like a lot of work for little payoff. I think it would be a lot easier and less prone to error to have a Python helper function that “splits” the received vector observation into the appropriate fields. If/when the CollectObservations method is changed, you could change this helper method in tandem.
Is there a reason it is not reasonable to split the observation back on the Python side?
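As a rough sketch of such a helper, assuming the position-then-velocity layout from the example above (the names OBSERVATION_LAYOUT and split_observation are made up for illustration and are not part of the ML-Agents API):

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical layout table. It must mirror the order and sizes of the
# values added in CollectObservations on the Unity side.
OBSERVATION_LAYOUT: List[Tuple[str, int]] = [("position", 3), ("velocity", 3)]


def split_observation(
    flat: Sequence[float],
    layout: Sequence[Tuple[str, int]] = OBSERVATION_LAYOUT,
) -> Dict[str, List[float]]:
    """Split a flat observation vector into named fields."""
    expected = sum(size for _, size in layout)
    if len(flat) != expected:
        # Fail loudly if the Unity side changed and this table did not.
        raise ValueError(f"expected {expected} values, got {len(flat)}")
    fields: Dict[str, List[float]] = {}
    offset = 0
    for name, size in layout:
        fields[name] = list(flat[offset:offset + size])
        offset += size
    return fields


fields = split_observation([1, 0, 1, 5, 4, 3])
# fields["position"] is [1, 0, 1] and fields["velocity"] is [5, 4, 3]
```

Keeping the layout in one table means a change to CollectObservations only needs one matching edit on the Python side, and the length check catches the case where that edit was forgotten.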
The reasoning is mostly just that the default way seems prone to errors. As you said, I could just make sure to change both the Unity methods and the Python methods, but that’s very dependent on me remembering to do so, and I think it is generally better practice to remove such dependencies. It just seems like bad practice to hard-code things like that.
It actually wasn’t much work. The side channel was pretty easy to implement. I ended up just passing the gameObject instance ID in the CollectObservations method and then having the side channel generate a string that represents a Python dictionary in the form {gameObjectID: {Velocity: <x,y,z>, Position: <x,y,z>}}. I then cross-reference the gameObject ID from the CollectObservations method with the information sent through the side channel.
Out of curiosity, is the flattening of the observations done for speed? It seems like an odd decision (software engineering wise) when it’s generally common practice to index fields by name in something like a dictionary.
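A minimal sketch of the Python side of that cross-referencing, assuming the side channel string is JSON-encoded and the instance ID is the first (and only) value of the agent’s observation. The function names, the JSON encoding, and the sample ID 1042 are assumptions for illustration, not the exact code I used:

```python
import json
from typing import Dict, List, Sequence

State = Dict[str, List[float]]


def parse_states(message: str) -> Dict[int, State]:
    """Decode the side channel string into {gameObjectID: state}."""
    raw = json.loads(message)
    # JSON object keys are always strings, so convert them back to ints.
    return {int(object_id): state for object_id, state in raw.items()}


def state_for_observation(obs: Sequence[float], states: Dict[int, State]) -> State:
    """Look up the state of the gameObject whose ID was sent as an observation."""
    # Observations arrive as floats, so round before using the ID as a key.
    object_id = int(round(obs[0]))
    return states[object_id]


# Hypothetical message, as the side channel might deliver it.
message = '{"1042": {"Velocity": [5.0, 4.0, 3.0], "Position": [1.0, 0.0, 1.0]}}'
states = parse_states(message)
state = state_for_observation([1042.0], states)
# state["Position"] is [1.0, 0.0, 1.0] and state["Velocity"] is [5.0, 4.0, 3.0]
```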
It is for performance. Sending strings is very inefficient. In our use case, the data is meant to be consumed by a neural network that just takes a vector of unlabeled floats as input, so we did not see a reason to label the observations in the first place.
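To give a rough feel for the size difference, here is an illustrative comparison of the same six values packed as raw 32-bit floats versus encoded as a labeled string. This is only a sketch using the Python standard library; it is not how ML-Agents actually serializes observations:

```python
import json
import struct

values = [1.0, 0.0, 1.0, 5.0, 4.0, 3.0]

# Six unlabeled 32-bit floats: 6 * 4 = 24 bytes.
packed = struct.pack("<6f", *values)

# The same data as a labeled, human-readable string.
labeled = json.dumps(
    {"Position": values[:3], "Velocity": values[3:]}
).encode("utf-8")

packed_size = len(packed)
labeled_size = len(labeled)
print(f"packed floats: {packed_size} bytes, labeled string: {labeled_size} bytes")
```

On top of the larger payload, the string form also has to be built and parsed on every step.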