Policy for promoting leg height

Howdy guys.

So I have this walking / stumbling agent :slight_smile: that I want to prioritize leg height (that being a flexed/outstretched leg) most of the time.

I’ve tried training something like this.

legHeight is sampled ahead of time.

// just normalized values
rLegHeight = Mathf.Abs(rFoot.position.y - root.transform.position.y) / legHeight;
lLegHeight = Mathf.Abs(lFoot.position.y - root.transform.position.y) / legHeight;

// take the higher of the two and punish for not going all the way
legPenalty = 1 - Mathf.Max(rLegHeight, lLegHeight);
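In case it helps to see the math outside Unity, here's a framework-free Python sketch of the same penalty (scalar y-values stand in for the transform positions in the snippet above):

```python
def leg_penalty(r_foot_y, l_foot_y, root_y, leg_height):
    # Normalized leg extension: ~1 when the foot is a full leg length
    # away from the root, smaller when the leg is flexed/raised.
    r = abs(r_foot_y - root_y) / leg_height
    l = abs(l_foot_y - root_y) / leg_height
    # Penalize based on the most-extended leg.
    return 1 - max(r, l)

# Both legs half-flexed -> penalty 0.5; one leg fully extended -> penalty 0.
print(leg_penalty(0.5, 0.5, 1.0, 1.0))  # 0.5
print(leg_penalty(0.0, 0.5, 1.0, 1.0))  # 0.0
```

Note that as written, the penalty goes to zero as soon as one leg is fully extended, so it mostly pressures the agent out of states where both legs are flexed at once.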

But as I'm training I feel like there must be a better way of doing this. Anyone else played around with this?

For me, I found that a negative reward on foot velocity near the ground works well to promote higher foot clearance in locomotion. Your solution seems legit; I wonder how well it works for you.
I took inspiration from “Generalizing Locomotion Style to New Animals With Inverse Optimal Regression” (Eq. 11) to formulate the following reward:

float getFeetGroundReward() {
    RaycastHit hit;
    float maxRaycastDist = 10f;
    float feetGroundReward = 0f;

    for (int i = 0; i < numFeet; i++) {
        // Measure each foot's height above the ground directly below it.
        if (Physics.Raycast(feetObjects[i].transform.position, Vector3.down, out hit, maxRaycastDist))
        {
            // Horizontal (XZ-plane) foot speed, ignoring vertical velocity.
            float horizontalSpeed = Vector3.ProjectOnPlane(feetRigidBodies[i].velocity, Vector3.up).magnitude;
            // Penalize fast horizontal motion near the ground; the exponential
            // decays the penalty as the foot rises.
            feetGroundReward += Mathf.Exp(-10f * hit.distance) * Mathf.Sqrt(horizontalSpeed);
        }
    }

    return -feetGroundReward;
}

The exponential reduces the weight of the reward as the distance of the foot from the ground increases, and the square root on the velocity increases the weight of small velocities. I also project onto the XZ plane, i.e., I ignore vertical velocity.
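To see how those two terms shape the penalty, here's a quick numeric check of the same expression in plain Python (no Unity; scalars stand in for the raycast distance and horizontal foot speed):

```python
import math

def foot_ground_penalty(ground_dist, horizontal_speed):
    # exp(-10 d) gates the penalty by height above the ground;
    # sqrt(v) boosts the relative weight of small speeds.
    return math.exp(-10.0 * ground_dist) * math.sqrt(horizontal_speed)

# A foot sliding at 1 m/s on the ground is penalized roughly 20x more
# than the same foot moving at 1 m/s while 0.3 m up.
sliding = foot_ground_penalty(0.0, 1.0)  # 1.0
lifted = foot_ground_penalty(0.3, 1.0)   # exp(-3), about 0.05
```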
You can read more in their paper about the motivation for using this reward, as well as further extensions they make to tackle the problem. I like their explanations, but I haven't tried the additional proposed methods.

In “Minimizing Energy Consumption Leads to the Emergence of Gaits in Legged Robots” (Sec. 2.3) they address this issue too, but they propose an environment solution rather than a reward one: they train on uneven terrain so the agent learns to lift its feet implicitly.
I experimented with this, but eventually found that the former solution gave me the best results. To be fair, I haven't tried complex terrain setups like the ones they propose in that paper, so this solution may still be viable.
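For anyone wanting to try the terrain route, here's a minimal sketch of what "uneven terrain" can mean in practice: a hypothetical 1-D heightfield generator in plain Python (the paper doesn't prescribe this exact scheme, it's just one simple way to get bumps):

```python
import random

def uneven_heightfield(n_cells, amplitude=0.1, seed=0):
    # Random per-cell heights, smoothed once so neighboring
    # cells don't jump by the full amplitude.
    rng = random.Random(seed)
    raw = [rng.uniform(0.0, amplitude) for _ in range(n_cells)]
    return [
        (raw[max(i - 1, 0)] + raw[i] + raw[min(i + 1, n_cells - 1)]) / 3.0
        for i in range(n_cells)
    ]

heights = uneven_heightfield(16)  # 16 smoothed heights in [0, 0.1]
```

In Unity you could feed something like this into `TerrainData.SetHeights`, or just offset a row of ground colliders; the point is only that the agent can't keep its feet skimming a flat plane.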
