What is wrong with my model?

Hello. I am training a model for a game (a match-3 type game). Over 4 million iterations, the size of the model (*.onnx) has not changed at all and is only about 60 KB. That is, the size is the same after 100 iterations and after 4 million. For comparison, the model in the Match 3 example from Unity is 1.5 MB. In addition, my model is barely learning at all. What could be the problem? I am attaching information about the software in the screenshot. Thanks in advance.

Hi
Why it's not working could be any one of a million reasons. I suggest taking a look at the TensorBoard stats to get some clues as to what might be wrong.
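If you haven't used it before: ML-Agents writes TensorBoard summaries while training, so you can usually view them with `tensorboard --logdir results` (on older ML-Agents releases the directory is called `summaries`; the exact path depends on your setup).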

Re: “the size of the model (*.onnx) has not changed at all and is only about 60 kb”
This is how it is supposed to be. The model size does not change because we are not adding anything to the model. The structure of the brain, its number of weights, etc. will not change. The only things that change are the values of the weights and biases of each neuron. So the size of the ONNX file will NOT change over time.
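As a rough sanity check (assuming the usual 32-bit float weights): 60 KB / 4 bytes per weight is roughly 15,000 parameters, which is consistent with a very small network, while a 1.5 MB model holds on the order of 375,000 parameters. The difference just means the Match 3 example is configured with a larger network (`hidden_units` / `num_layers` in the trainer config), not that your model failed to grow.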

Hi. Many thanks for the answer. Here is a screenshot of the stats. But what conclusions can be drawn from it? That the model really isn't learning?

I’m assuming based on the first console screenshot that this is NOT a competitive game where two different agents are competing with each other.

The green run seems to be collapsing to determinism at around 400K steps. The policy loss suddenly drops to zero, which means zero changes to the policy, or in other words zero learning. This is in spite of there still being some loss on the value graph.

In the blue run, the agent quickly learns one solution that gains a reward of 4.6 and then learns nothing new after that. Both policy loss and value loss are close to zero, meaning it is learning nothing.

When you're training a model, if the policy loss or the value loss suddenly drops to zero, stop the run, as continuing is pointless. The model has collapsed.

If the value loss explodes to very high values, also stop the run, as continuing is pointless. The gradients have probably exploded.

The following graph is an example of gradient explosion starting from the curiosity module.

Machine learning models can collapse to determinism, get stuck in local minima/maxima, or run into other issues such as exploding gradients.

It is possible that your hyperparameters are unsuitable for what you're doing. There is no exact science for tuning hyperparameters other than experimentation; buffer size and batch size are a good place to start.
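For reference, these are set in the trainer config YAML that you pass to `mlagents-learn`. A minimal sketch, assuming PPO; the behavior name and all the values here are placeholders, only the keys come from the ML-Agents config schema (note that for PPO, `buffer_size` should be a multiple of `batch_size`):

```yaml
behaviors:
  MyMatch3Agent:            # placeholder: must match the Behavior Name in Unity
    trainer_type: ppo
    hyperparameters:
      batch_size: 256       # experiment with powers of two
      buffer_size: 2560     # should be a multiple of batch_size
      learning_rate: 3.0e-4
    network_settings:
      hidden_units: 128     # this is what controls the .onnx size
      num_layers: 2
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
    max_steps: 5000000
```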

Or:

Your simulation (the Unity-side environment) could have a flaw in it. This could be:

  1. Game logic flaws
  2. Incorrect sensor setup
  3. Game bugs
  4. A flawed reward system (see the sketch below)
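On point 4, here is a hedged C# sketch of the kind of reward bug that can silently break training. The class, the helpers, and the values are all hypothetical; only the `Agent` API calls (`OnActionReceived`, `AddReward`, `SetReward`, `EndEpisode`) are from ML-Agents:

```csharp
using Unity.MLAgents;
using Unity.MLAgents.Actuators;

// Hypothetical match-3 agent; helpers and reward values are illustrative only.
public class ExampleMatch3Agent : Agent
{
    public override void OnActionReceived(ActionBuffers actions)
    {
        bool matched = TryApplyMove(actions.DiscreteActions[0]);

        if (matched)
        {
            AddReward(0.5f);    // the main objective should carry the largest reward
        }
        else
        {
            // Common flaw: a step penalty as large as the success reward
            // (e.g. -0.5f) teaches the agent that "do nothing useful" and
            // "find matches" are equally good, and the policy can collapse.
            AddReward(-0.01f);  // keep shaping penalties small
        }

        if (BoardCleared())
        {
            SetReward(1.0f);
            EndEpisode();       // forgetting EndEpisode() is another classic setup bug
        }
    }

    // Stub helpers so the sketch compiles; real game logic goes here.
    bool TryApplyMove(int move) => move % 2 == 0;
    bool BoardCleared() => false;
}
```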

Reinforcement learning is a complex topic. Unity ML-Agents makes the technology a lot more accessible; however, the underlying complexity still remains.

Consider this: on the project I am working on right now, if I change the shape of the dense reward system (the values of the rewards) so that the rewards are equally scaled, the model will not train at all. It can run for 5 million steps without collapsing or exploding, yet learn absolutely nothing. If instead I rescale the three rewards to max values of 0.50, 0.24 and 0.11, the model trains very well and reaches decent performance by 400K steps.
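To make that concrete, here is a minimal sketch of what such unequal scaling might look like in agent code. Everything here is hypothetical except the three max values, which are the ones from my runs:

```csharp
using Unity.MLAgents;

// Hypothetical sketch of unequal reward scaling; the event names are invented.
public class ShapedRewardAgent : Agent
{
    const float PrimaryMax   = 0.50f; // main objective
    const float SecondaryMax = 0.24f; // supporting behaviour
    const float TertiaryMax  = 0.11f; // minor shaping signal

    // Each argument is assumed to be progress normalized to [0, 1].
    public void RewardProgress(float primary, float secondary, float tertiary)
    {
        AddReward(primary * PrimaryMax);
        AddReward(secondary * SecondaryMax);
        AddReward(tertiary * TertiaryMax);
    }
}
```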

The Unity ML-Agents documentation recommends starting by running the example projects, and only then making small changes to them.
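For example, assuming you have cloned the ml-agents repository with its default layout, the Match 3 example can be trained with `mlagents-learn config/ppo/Match3.yaml --run-id=match3_baseline` (the config path and run id here are illustrative).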

As for myself, I'm in the process of migrating from custom-built AI to ML-Agents.
On that topic, I'm writing a series on medium.com about my own progress in training something more complex than the examples that come with ML-Agents. I started it as a forum post here, but it became way too long for a forum post. It may or may not help you, in terms of giving you some ideas from my own failures.

Admittedly, the simulation I am working on is asymmetric, competitive, and does not use the physics system for movement, all of which makes it quite hard to achieve stable training. I come from an AI/ML background plus a Unity game dev background, and yet it has taken me two months of experimentation with ML-Agents, over more than 50 training runs, to get to a working solution. I will document that in Part 4 (it is still training right now).

If you're interested, you can find the links to these articles on my Unity forums profile page.

All the best 🙂

ChillX,
Thanks a lot for such a detailed answer. I will definitely read all parts of your articles.
Maybe after that I will have some ideas about how to reconfigure the model.
Cheers.