Hello! I have recently begun running tests with Unity ML-Agents.
After a few simple tests in which the agents stubbornly refused to learn in the environments I put them in, I tried to create the simplest environment I could think of. I call him yesbot, and he should always answer yes. [Code Below]. Yesbot receives a value, 1 or 0, and outputs a value, 1 or 0. If his output is the same as the input, he wins a +1 reward. If it is not, he receives a -1 reward.
Despite how simple this environment is, after 100,000 steps (as recorded in Windows Terminal) the bot was sitting at a steady 60% accuracy rate. I tallied the "Right!"/"Wrong!" output from my Debug.Log calls several times while running the experiment and found that after about 20,000 steps it had settled at 60% (60.0x%). It never progressed above 60%. While watching my visual representation in the scene view, I noticed that the agents (36 of them) were 100% correct on every third decision, and on the other two decisions they were only ~50% right. This behavior was incredibly consistent, and would roughly account for the ~60% average accuracy. The problem persisted even when I ran a trained model in inference mode.
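As a sanity check on that arithmetic, here is a minimal sketch in plain C# (not ML-Agents code; the class and method names are hypothetical). It assumes one decision in every N is always correct and the remaining N - 1 are coin flips, which gives an expected accuracy of (1 + 0.5 * (N - 1)) / N.

```csharp
using System;

// Hypothetical helper, for illustration only: not part of ML-Agents.
public static class AccuracyCheck
{
    // Expected accuracy when one decision in every `period` is always right
    // and the other (period - 1) decisions are 50/50 guesses.
    public static double ExpectedAccuracy(int period)
    {
        if (period < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(period));
        }
        return (1.0 + 0.5 * (period - 1)) / period;
    }

    public static void Main()
    {
        // Print the expected accuracy for a few candidate periods.
        for (int period = 2; period <= 6; period++)
        {
            Console.WriteLine("period " + period + ": " + ExpectedAccuracy(period));
        }
    }
}
```

Under this assumption, a one-in-three pattern works out to 2/3 (about 66.7%), while exactly 60% corresponds to one-in-five (3/5), so the "every third decision" count may be worth rechecking against the logs.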
I am using a single Discrete branch with 2 possible values. My Behavior Type is set to Default. I am using a YAML config file, in which I have only changed the behavior name and the max steps.
I am very new to this and understand that I may be overlooking simple concepts. If you don't have an answer, I would appreciate it if you could point me toward documentation that might have one.
-Rylan Yancey, Amateur dev
using System.Collections;
using System.Collections.Generic;
using UnityEngine;
using Unity.MLAgents;
using Unity.MLAgents.Sensors;
using Unity.MLAgents.Actuators;

public class YesBot : Agent
{
    // The value (0 or 1) the agent is expected to echo back this episode.
    int yes;
    [SerializeField] Material Win;
    [SerializeField] Material Lose;

    public override void OnEpisodeBegin()
    {
        // Random.Range with int arguments excludes the upper bound, so this is 0 or 1.
        yes = Random.Range(0, 2);
    }

    public override void CollectObservations(VectorSensor sensor)
    {
        // The only observation is the value to be echoed.
        sensor.AddObservation(yes);
    }

    public override void OnActionReceived(ActionBuffers actions)
    {
        MeshRenderer a;
        a = GetComponent<MeshRenderer>();
        // Reward a match between the chosen action and the observed value.
        if (actions.DiscreteActions[0] == yes)
        {
            AddReward(+5f);
            Debug.Log("Right!");
            a.material = Win;
        }
        else
        {
            AddReward(-5f);
            Debug.Log("Wrong!");
            a.material = Lose;
        }
        // Every action ends the episode, which picks a new value in OnEpisodeBegin.
        EndEpisode();
    }
}
