Hello everyone, I was wondering if anyone has run into this situation… I am trying to train an agent with the Unity ML-Agents package, and I went through all of the guides. I set up a simple scenario in which I request 3 continuous actions and no discrete ones, as you can see here:
But for some reason that I don’t understand, instead of getting ‘random’ values between -1 and 1, I keep getting 0, 1, and -1 about 99% of the time, from the moment the game starts. Here are my console logs (please note that the Collapse option is enabled, so you can see how many times each value is repeated) and the snippet of the code I’m using for this simple test:
Yes, I do get some random values in the first value (index 0) of the ActionBuffers. However, that only happens sometimes, and in my particular scenario it doesn’t really help, because the agent would basically never reach the point where it can collect a positive reward.
I’ve been trying to figure this out for hours without success. I also ran another test based on a video I found on YouTube, in which a cube has to move toward a specific position. That test needs 2 values, and there I do get different ‘random’ values every single time, as I would expect. However, I am not doing anything differently! So how is this case different from the other one? Just for reference, this is how my other test (where I was simply trying out the library) looks:
And as I mentioned before, this one does return values between -1 and 1, and therefore the agent can actually learn how to get to the reward.
It got to the point where I even suspected my PC was somehow faulty or damaged, so I formatted my whole system and did some maintenance on it, but I still get the same 0, 1, and -1 values all the time. I also created a build and tried that; same exact results. I then tried the build on a different PC; same exact results!
I would highly appreciate anyone’s help! Honestly, at this point, I don’t really know what else to try… It just doesn’t make any sense to me.