I am unsure whether this is a bug or intended behavior. Since I am setting up my environment slightly differently, I am posting it here in case that is what is causing the issue: it would require a significant amount of work to set up my environment the way the demo does, as I am trying to integrate ML-Agents with a game.
Environment:
TensorFlow version: 2.0.1
Unity version: 2018.4.9f1
ML-Agents version: 0.13.1
The current problem I am having is that I am able to train my agent; however, when I record demo files for imitation learning, the demo files don't look quite right, as shown in the attached snippet: all episodes, rewards, and experiences are 0.
However, as shown in the following reward graph, I am clearly setting up rewards correctly. I have also verified that my reward function is called and assigns rewards accordingly; it looks roughly like the sketch below.
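For reference, a simplified version of my reward assignment (the class and tag names here are just illustrative placeholders, not my actual code):

```csharp
using UnityEngine;
using MLAgents;

public class MyAgent : Agent
{
    // Illustrative reward assignment: reward the agent when it reaches
    // a goal, then mark the episode as done.
    void OnTriggerEnter(Collider other)
    {
        if (other.CompareTag("Goal"))
        {
            AddReward(1.0f);
            Done();
        }
    }
}
```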
I suspect the issue might be that I destroy the agent whenever it is "done", after the AgentOnDone method is called. This is due to the current architecture of the software: destroying the agent completely once it is done makes many other operations significantly easier.
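A simplified sketch of what I mean (my actual class differs; only the teardown logic matters here):

```csharp
using UnityEngine;
using MLAgents;

public class MyAgent : Agent
{
    // Called by ML-Agents when the agent is done and "Reset On Done" is
    // disabled. Instead of resetting the agent in place as the demos do,
    // I destroy the whole GameObject.
    public override void AgentOnDone()
    {
        Destroy(gameObject);
    }
}
```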
Although I could rewrite my code to mimic what the demo does, I would be very grateful if I could avoid that. Hence the question: is this intended behavior, and will it cause any issues when training with these demo files?