I’m working on a soccer/football project where the longer you hold the button, the more powerful the shot. I have no idea how to implement this in ML-Agents. Shot direction is pretty straightforward: two continuous actions give the x and y axes. Shot power could also be a simple continuous action, BUT it has to be timed. If the shot is not timed, it creates unbalanced gameplay: the player (you) has to wait 1 second to reach full shot power, whereas the AI shoots at full power instantly.
Please can someone point me in the right direction. I’m genuinely at a loss. Thank you.
Have you tried adding the current power level as an observation? The agent should be able to figure out timed behaviour if it knows that there’s a relation between a button press (its actions) and the shot strength. [EDIT: That would be a discrete action (on/off) that does the same as a button press, with no continuous strength control.]
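To make the idea above concrete, here is a minimal sketch of charge logic driven by an on/off "hold" action. It is plain C# with no Unity or ML-Agents dependency, and `ShotCharger`, `Step`, `Charge` and `maxChargeSeconds` are hypothetical names made up for illustration, not part of any library. The assumption is that the agent would call `Step` once per decision step and add `Charge` to its observations.

```csharp
using System;

// Hypothetical helper (not part of ML-Agents): tracks how long a "hold" action
// has been active and reports the shot power when it is released.
public sealed class ShotCharger
{
    private readonly float maxChargeSeconds;
    private float heldSeconds;
    private bool wasHeld;

    public ShotCharger(float maxChargeSeconds = 1f)
    {
        if (maxChargeSeconds <= 0f)
            throw new ArgumentOutOfRangeException(nameof(maxChargeSeconds));
        this.maxChargeSeconds = maxChargeSeconds;
    }

    // Current charge in [0, 1]; intended to be exposed to the agent as an observation.
    public float Charge => Math.Min(heldSeconds / maxChargeSeconds, 1f);

    // Call once per decision step. Returns the shot power in [0, 1] on the step
    // where the hold action is released, and null on every other step.
    public float? Step(bool held, float deltaSeconds)
    {
        float? released = null;
        if (held)
        {
            heldSeconds += deltaSeconds;
        }
        else
        {
            if (wasHeld) released = Charge; // read the charge before resetting it
            heldSeconds = 0f;
        }
        wasHeld = held;
        return released;
    }
}
```

With something like this, the heuristic could simply report whether the key is currently held (for example via `Input.GetKey`), so the human and the agent go through the same charge-up code and the AI cannot shoot at full power instantly.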
Or maybe you don’t necessarily need fine-tuned control of the shot strength? If your game works with, let’s say 3 different values, maybe you could map those to different keys and discrete actions.
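A rough sketch of that second option, again in plain C#: one discrete action index is mapped to a fixed shot power. The three power values and the names `ShotPower`, `Levels`, `BranchSize` and `FromAction` are assumptions for illustration only and would need tuning for the actual game.

```csharp
using System;

// Hypothetical mapping from a discrete action index to a fixed shot power.
public static class ShotPower
{
    // Assumed power levels (weak, medium, strong); not taken from any real project.
    private static readonly float[] Levels = { 0.4f, 0.7f, 1.0f };

    // Size the discrete action branch would need: "no shot" plus one entry per level.
    public static int BranchSize => Levels.Length + 1;

    // 0 means "no shot" (returns null); 1..Levels.Length select a power level.
    public static float? FromAction(int action)
    {
        if (action < 0 || action > Levels.Length)
            throw new ArgumentOutOfRangeException(nameof(action));
        return action == 0 ? (float?)null : Levels[action - 1];
    }
}
```

On the player side, each level would be bound to its own key, so neither the player nor the agent has to charge anything up.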
I have got it working. Sort of. At the very least, the heuristic controls are working the way I want them to. When I press the button, the shot powers up and shoots the way I want it to.
public override void Heuristic(in ActionBuffers actionBuffers)
{
    var actionsout = actionBuffers.ContinuousActions;
    actionsout.Clear();

    // Indices 1 and 2: shot direction from the movement axes.
    actionsout[1] = Input.GetAxisRaw("Vertical");
    actionsout[2] = Input.GetAxisRaw("Horizontal");

    // Key pressed: remember when charging started.
    if (Input.GetKeyDown("o"))
    {
        startTime = Time.time;
        StartCoroutine("WaitPeriod"); // coroutine defined elsewhere (not shown)
        pressed = true;
    }

    // Key released: index 0 gets the hold time, clamped to [0, 1], as shot power.
    if (pressed && Input.GetKeyUp("o"))
    {
        actionsout[0] = Mathf.Clamp(Time.time - startTime, 0, 1);
        print(Time.time - startTime);
        pressed = false;
        StopCoroutine("WaitPeriod");
    }
}
I achieved this with a coroutine. Maybe I’m just being paranoid, but something tells me it might not function that well during actual training. I’ll update this when I have results. But thanks for the observation suggestion, I hadn’t thought of adding the power level/time as an observation. Methinks it might just work.