I am a beginner to Unity, so I followed this guide, trained the built-in 3DBall example on my system, and then moved on to creating a new environment.
Then I created a custom Unity environment for a snake game, following the guide given here. All actions, observations, etc. are defined, and manual input also works fine using heuristics, as explained here. The game runs without any human intervention as well, as seen in the video here.
Lastly, when I run the Python training command, it asks me to press the Play button (as usual). After I press the Play button in Unity, it says it is connected (screenshot attached). The game then starts taking actions (at a very fast pace compared to the normal play rate), but the terminal does not show anything. After some time, the terminal gives an error saying the environment took too long to respond.
I have searched and read many articles but could not find a workaround. I have also made a video to make things clearer.
The configuration file is the same as the one defined at this link (except for the name in the first row).
The necessary screenshots, videos, logs, and any other requested material can be accessed here.
My versions:
- ml-agents: 0.16.0 (installed via pip)
- ml-agents-envs: 0.16.0
- Communicator API: 1.0.0
- ml-agents Unity package: 1.0.0
- TensorFlow: 2.0.1
- Unity: 2019.3.12f1
Hi @mshajeehm ,
Would you be willing to share your project for us to test?
Could you also share your editor logs?
Yes, for sure. (By "share", do you mean collaborate, or upload the whole project?) Please send me your Unity ID/email so that I can share it with you.
Also, I have created a project folder on OneDrive with all the necessary information (logs, project files, screenshots, video), which can be accessed here.
One more clarification is needed.
My observation space is variable: it consists of the positions of the snake's head and all of its body parts. As the snake eats food, the number of body parts grows, and so does the observation space.
What are the best practices for addressing this?
I also tried passing a list to the sensor.AddObservation method, but it gave an error, so currently I am iterating through the whole list and adding the observations one by one. In the Inspector window, I have set the observation size to 1000 (the max). It keeps warning that observations are being padded because fewer were provided than declared. My code:
foreach (Vector2Int position in snake.instance.getFullSnakeGridPositionList())
{
    sensor.AddObservation(position);
}
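For what it's worth, the list call most likely failed because VectorSensor.AddObservation accepts an IEnumerable<float>, not a collection of Vector2Int. A minimal sketch of flattening the positions first (reusing the same helper; the variable name flat is mine):

// Flatten the Vector2Int positions into floats, since AddObservation
// takes IEnumerable<float> but not a list of Vector2Int.
var flat = new List<float>();
foreach (Vector2Int position in snake.instance.getFullSnakeGridPositionList())
{
    flat.Add(position.x);
    flat.Add(position.y);
}
sensor.AddObservation(flat);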
Hi @mshajeehm ,
- Is your agent in heuristic mode?
- A variable observation space isn’t really supported. You could instead observe the whole grid and flag the cells that are occupied by snake segments. This would allow the snake length to vary while keeping the observation space static.
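A minimal sketch of that idea, assuming a gridWidth x gridHeight board and a hypothetical getFoodGridPosition() accessor (everything not in the original code is a placeholder):

// Observe every cell of the grid so the observation size is fixed,
// regardless of snake length. gridWidth, gridHeight, and
// getFoodGridPosition() are assumptions about the project's API.
public override void CollectObservations(VectorSensor sensor)
{
    var body = new HashSet<Vector2Int>(snake.instance.getFullSnakeGridPositionList());
    Vector2Int food = snake.instance.getFoodGridPosition();

    for (int x = 0; x < gridWidth; x++)
    {
        for (int y = 0; y < gridHeight; y++)
        {
            var cell = new Vector2Int(x, y);
            float value = 0f;                    // empty cell
            if (body.Contains(cell)) value = 1f; // snake segment
            else if (cell == food) value = 2f;   // food
            sensor.AddObservation(value);
        }
    }
}

The Space Size in the Inspector then becomes exactly gridWidth * gridHeight, so the padding warnings go away.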
Thanks for the fixed observation space idea. I'll implement it.
Yes, it is in heuristic-only mode. I also ran it in default mode; it connected with the brain but gave the following error: File 5.jpg
Ok, that seems to be why it's timing out. We will fix this issue, but you should know you are not actually training your agent while it's in heuristic mode; you are controlling it via your heuristic function. You need to set the Behavior Type to Default in the Inspector of your agent in order for it to train.
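For reference, the same switch can also be made from code; a sketch, assuming the release-1 package API:

using Unity.MLAgents.Policies;

// Switch the agent to Default so it trains via the external trainer
// instead of running its Heuristic() function.
var behavior = GetComponent<BehaviorParameters>();
behavior.BehaviorType = BehaviorType.Default;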
Well, that is news to me ;).
Running in default mode gave this error:

Edit: The error was resolved by adding sequence_length to the config file. Thanks @christophergoy
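For anyone hitting the same error: in the 0.16 trainer config format, sequence_length sits alongside the other per-behavior keys. A rough sketch (the behavior name and values here are placeholders, not the actual config):

Snake:
  trainer: ppo
  batch_size: 1024
  buffer_size: 10240
  max_steps: 5.0e5
  use_recurrent: false
  sequence_length: 64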
Now I am going to train it.
It would be very kind of you to give any more tips after looking at my reward function and hyperparameters at this link (Assets > Scripts > snakeAgent.cs).
@christophergoy
I trained my model for 50,000 steps, and the results are given below. Judging by the seemingly random mean reward at each iteration, I presume my reward function / hyperparameters are not set up correctly.
Hi @mshajeehm ,
First, you normally want the simplest reward function possible. In this case you may want to give the agent a reward for finding the apple, and that's it. It may take a while to train (much more than 50k steps). I would recommend simplifying your reward function and letting it train for a longer period of time.
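As a minimal sketch of that scheme (the two hooks are hypothetical; wire them to wherever the game already detects eating and dying):

// The simplest reward scheme: reward the apple, nothing else.
public void OnAteApple()
{
    AddReward(1.0f);   // the only positive signal
}

public void OnDied()
{
    EndEpisode();      // no explicit death penalty; just restart the episode
}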
Hi @christophergoy
I need one more piece of advice, please.
When an episode ends, I reload the scene using Unity's scene management in the OnEpisodeBegin method. But it gives the common error about accessing an instance of an object that does not exist.
I searched the web and learned that Unity ML-Agents has this known issue with Academy steps: reloading the scene destroys the game objects, which is where the error comes from.
I also tried writing a function that resets all the variables, but that too is very messy, since multiple classes with multiple instances are involved. Then I tried async reloading (suspecting it wouldn't work), and it didn't. Oddly, my 50k iterations completed last night using the same scene-reloading logic without this error (screenshot above). Probably the snake never died in those 50k iterations.
But when I started working this morning, the error popped up after every episode end, no matter what I did.
So what are my options for getting a smooth restart after an episode ends? Is there any way to reload the scene without getting Academy errors?
The scene reload issue should be fixed in Release 1 of ml-agents. Please try there and let me know if you’re still having issues.
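In the meantime, resetting state in OnEpisodeBegin instead of reloading the scene sidesteps the destroyed-object problem entirely, since the Academy and Agent are never torn down. A rough sketch (both reset methods are hypothetical, project-specific stand-ins):

// Reset the environment in place rather than reloading the scene,
// so no GameObjects are destroyed between episodes.
public override void OnEpisodeBegin()
{
    snake.instance.resetToStartState(); // hypothetical: put the snake back at spawn
    foodSpawner.respawnFood();          // hypothetical: place a fresh piece of food
}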
@christophergoy, could you look into UnityGymWrapper Crash after 2M iterations - Unity Forum and let me know whether it is related or not? I am a bit between a rock and a hard place with it.


