Heuristic is called when Behavior Type is set to Inference Only

Hi there
I am new to Unity and am trying out an ML-Agents-based path-finding game.
I trained my agent for some time and got good results (I watched the cumulative reward reach its top value in TensorBoard). But when I assigned the trained model to the agent, nothing happened.
I set the behavior type to ‘Inference Only’ and to ‘Default’, and in both cases nothing happens; when I press the arrow keys the agent moves (meaning it uses the Heuristic method).
Please help me understand where I am going wrong.

I have also attached the agent’s code.
Thank you

using System.Collections;
using System.Collections.Generic;
using UnityEngine;
using Unity.MLAgents;
using Unity.MLAgents.Sensors;

public class PlayerAgentBasic : Agent
{
    [SerializeField]
    private float speed = 5.0f;

    [SerializeField]
    private float distanceRequired = 1.5f;

    [SerializeField]
    private GameObject target;

    [SerializeField]
    private Material successMateial;

    [SerializeField]
    private Material failMateial;

    [SerializeField]
    private Material defaultMateial;

    [SerializeField]
    private MeshRenderer groundMeshRenderer;

    #region Private Instance Variables
    private Rigidbody playerRigidBody;
    private Vector3 originalPosition;
    private Vector3 originalTargetPosition;
    #endregion

    public override void Initialize()
    {
        playerRigidBody = GetComponent<Rigidbody>();
        originalPosition = transform.localPosition;
        originalTargetPosition = target.transform.localPosition;
    }

    public override void OnEpisodeBegin()
    {
        // Face the target, then randomize the target and agent positions along z.
        transform.LookAt(target.transform);
        target.transform.localPosition = new Vector3(originalTargetPosition.x, originalTargetPosition.y, Random.Range(-4, 4));
        transform.localPosition = new Vector3(originalPosition.x, originalPosition.y, Random.Range(-4, 4));
    }
    public override void CollectObservations(VectorSensor sensor)
    {
        sensor.AddObservation(transform.localPosition); // x y z (3 dim)
        sensor.AddObservation(target.transform.localPosition); // x y z (3 dim)
        sensor.AddObservation(playerRigidBody.velocity.x); // velocity along x (1 dim)
        sensor.AddObservation(playerRigidBody.velocity.z); // velocity along z (1 dim)

    }

    public override void OnActionReceived(float[] vectorAction)
    {
        var vectorForce = new Vector3();
        vectorForce.x = vectorAction[0];
        vectorForce.z = vectorAction[1];

        playerRigidBody.AddForce(vectorForce * speed);

        var distanceFromTarget = Vector3.Distance(transform.localPosition, target.transform.localPosition);

        if(distanceFromTarget < distanceRequired) // we are doing good
        {
            SetReward(1.0f);
            EndEpisode();
            StartCoroutine(swapGroundMaterial(successMateial, 0.5f));

        }
        if (transform.localPosition.y < 0) // fell off the floor
        {
            AddReward(-.5f);
            EndEpisode();

            StartCoroutine(swapGroundMaterial(failMateial, 0.5f));
        }
        // go back and punish the agent for falling

    }

    public override void Heuristic(float[] actionsOut) // telling the agent how to move
    {

        actionsOut[0] = Input.GetAxis("Horizontal"); // x
        actionsOut[1] = Input.GetAxis("Vertical"); // z
    }

    private IEnumerator swapGroundMaterial(Material mat, float time)
    {
        groundMeshRenderer.material = mat;
        yield return new WaitForSeconds(time);
        groundMeshRenderer.material = defaultMateial;

    }
}

Can someone please take a look?
I have been stuck on this for a week now… :(

I’ll flag this for the team to have a look. Which version of ML-Agents are you using?

Hi, thank you!
I’m using version 1.0.7

Hi @Shachartz0,
Sorry I missed your thread. When you hit play in the editor, what does your BehaviorParameters component look like? It does sound like it is in Heuristic mode instead of running inference.

Hi, thank you for your reply.
I set the behavior type to ‘Inference Only’, but when I hit play the Heuristic method is called, and I don’t understand why.

Yes, I understand that. Can you take a screenshot of the BehaviorParameters component of your agent after you hit play? I’d like to see if it was set back to Heuristic Only.
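
If a screenshot is awkward, the same information could probably be logged from a script instead. The following is only a minimal sketch, assuming the ML-Agents 1.0.x API (`BehaviorParameters` in `Unity.MLAgents.Policies`, with its `BehaviorName`, `BehaviorType` and `Model` properties, and `Academy.Instance.IsCommunicatorOn`). The class name `BehaviorDebugLogger` is hypothetical and not part of the package.

```csharp
using UnityEngine;
using Unity.MLAgents;
using Unity.MLAgents.Policies;

// Hypothetical debugging helper (not part of ML-Agents).
// Attach it to the same GameObject as the agent and check the Console after hitting play.
public class BehaviorDebugLogger : MonoBehaviour
{
    private void Start()
    {
        var behavior = GetComponent<BehaviorParameters>();
        if (behavior == null)
        {
            Debug.LogWarning("No BehaviorParameters component found on " + name);
            return;
        }

        // Name of the assigned model asset, or a placeholder if the slot is empty.
        var modelName = behavior.Model != null ? behavior.Model.name : "<none>";

        Debug.Log(
            $"Behavior name: {behavior.BehaviorName}, " +
            $"type: {behavior.BehaviorType}, " +
            $"model: {modelName}, " +
            $"trainer connected: {Academy.Instance.IsCommunicatorOn}");
    }
}
```

If the logged type is not `InferenceOnly`, or the model shows as `<none>`, that would suggest the settings seen at runtime differ from the ones configured in the Inspector.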

It stays in inference only…

What does your Decision Requester component look like?

[Attached screenshot: upload_2021-3-30_21-33-47.png]

And “PlayerMaze” is the model you want to use? Not PlayerMaze1 or PlayerMaze2?

Would you be willing to share your project in some way? Is it on GitHub or some other repo hosting service where I could take a look?