AI Planner: Multiple questions regarding a "spider kills and eats prey" scenario +Burst Error BC1025

Greetings!

I’ve been looking for a GOAP solution for an upcoming Unity ECS project, and your AI Planner seems like it could be a nice fit for it!

In order to better understand the AI Planner, see how stable/polished it is, and if it’s a good match for the project, I’m trying to make a simulated “spider kills and then eats prey” scenario.

I used the Step-by-Step guide and video from this thread as a starting point, and I was able to make the spider eat food that it finds lying around, using these traits and an action:

Eatable Trait
Eater Trait
Eat Action

However, I also wanted the Spider to understand that it could obtain food by killing. So I created these Attacker and Damageable traits, as well as an Attack action:
Attacker Trait
Damageable Trait
Attack Action

I haven’t yet figured out how to make the target object only get destroyed and spawn food after its Health reaches 0. I suppose I could do it with an ICustomActionEffect<IStateData>, but I’m still in the process of understanding those. The Match3 sample project has a CustomSwapEffect class that implements ICustomActionEffect<StateData>, but then it starts doing some stuff with GetTraitBasedObjectId, GetTraitBasedObject and newState.GetTraitOnObject<Cell>, and I don’t know what ID or object I should use (if any). And what is the #if PLANNER_DOMAINS_GENERATED line for?
I would also like to use custom effects to add upper and lower bounds when increasing/decreasing values. For now I’m ignoring all of that and making every attack a 1-hit kill, but if anyone could point me in the right direction, that would be much appreciated!

I’m not sure if the way I set up my Attack action to create a deadBody object with Eatable and Location traits is enough, or if I also need a custom effect or callback that spawns a prefab that contains those traits (perhaps using deadBody as a parameter).

Another concern of mine is how the Attack agent’s position has to be exactly the same as its target’s. In most games that wouldn’t really work; you would need at least a “nearby” operator, and preferably a collision check or line trace. The operators we can use now are very limited, and I think this would be a great place for a visual scripting language similar to Unreal’s Blueprints, allowing for more complex conditions without leaving the Editor, but that’s a whole 'nother project. Hopefully Unity has something like that planned.


But going back to my setup… I added the Attack action to my SpiderAgentPlan. Does the order of the actions matter when setting up a plan? From what I understood, in this context a Plan contains a list of Actions that an agent can perform, which kinda contradicts the usual definition of a plan - a sequence of actions that an agent intends to perform. The real plan will only be determined while the game is running by the Planner, so it’s strange to call this list of available actions a “plan”, unless I’m misunderstanding something. I suggest changing the name to something like “Agent Profile”.

SpiderAgentPlan

I’ve been having some recurring problems with Unity and Visual Studio freezing when trying to run or reload the project, and then having to force close them. Yesterday I was getting a “Couldn’t create compiled assemblies folder” error that only stopped happening after closing Visual Studio, because VS had given itself exclusive access to that folder and I couldn’t even open it in Windows Explorer with my Admin account.

Today I’m having a different problem. In my simulation, I have one SpiderAgent, 3 pieces of food lying around, and a stationary Prey. The Spider eats all the food, but ignores the Prey. It’s probably related to this error I’m getting in the generated Attack.cs file:

I’ve tried disabling Burst compilation, and it makes the error go away, but the spider doesn’t change its behavior. I have a feeling that I’m missing something silly. I checked Attack.cs, and the error is happening in the ApplyEffects function. Line 95 only contains a “{”, so I imagine it’s actually referring to the line before it:
using (var deadBodyTypes = new NativeArray<ComponentType>(2, Allocator.Temp) {[0] = typeof(Eatable), [1] = typeof(Location), })

Full Attack.cs code

using System;
using Unity.Collections;
using Unity.Entities;
using Unity.Jobs;
using Unity.AI.Planner;
using Unity.AI.Planner.DomainLanguage.TraitBased;
using Unity.Burst;
using AI.Planner.Domains;
using AI.Planner.Domains.Enums;

namespace AI.Planner.Actions.SpiderAgentPlan
{
    [BurstCompile]
    struct Attack : IJobParallelForDefer
    {
        public Guid ActionGuid;
       
        const int k_agentIndex = 0;
        const int k_targetIndex = 1;
        const int k_MaxArguments = 2;

        [ReadOnly] NativeArray<StateEntityKey> m_StatesToExpand;
        StateDataContext m_StateDataContext;

        internal Attack(Guid guid, NativeList<StateEntityKey> statesToExpand, StateDataContext stateDataContext)
        {
            ActionGuid = guid;
            m_StatesToExpand = statesToExpand.AsDeferredJobArray();
            m_StateDataContext = stateDataContext;
        }

        public static int GetIndexForParameterName(string parameterName)
        {
           
            if (string.Equals(parameterName, "agent", StringComparison.OrdinalIgnoreCase))
                 return k_agentIndex;
            if (string.Equals(parameterName, "target", StringComparison.OrdinalIgnoreCase))
                 return k_targetIndex;

            return -1;
        }

        void GenerateArgumentPermutations(StateData stateData, NativeList<ActionKey> argumentPermutations)
        {
            var agentFilter = new NativeArray<ComponentType>(3, Allocator.Temp){[0] = ComponentType.ReadWrite<Unity.AI.Planner.DomainLanguage.TraitBased.Location>(),[1] = ComponentType.ReadWrite<AI.Planner.Domains.Moveable>(),[2] = ComponentType.ReadWrite<AI.Planner.Domains.Attacker>(),  };
            var targetFilter = new NativeArray<ComponentType>(2, Allocator.Temp){[0] = ComponentType.ReadWrite<Unity.AI.Planner.DomainLanguage.TraitBased.Location>(),[1] = ComponentType.ReadWrite<AI.Planner.Domains.Damageable>(),  };
            var agentObjectIndices = new NativeList<int>(2, Allocator.Temp);
            stateData.GetTraitBasedObjectIndices(agentObjectIndices, agentFilter);
            var targetObjectIndices = new NativeList<int>(2, Allocator.Temp);
            stateData.GetTraitBasedObjectIndices(targetObjectIndices, targetFilter);
            var LocationBuffer = stateData.LocationBuffer;
           
            for (int i0 = 0; i0 < agentObjectIndices.Length; i0++)
            {
                var agentIndex = agentObjectIndices[i0];
                var agentObject = stateData.TraitBasedObjects[agentIndex];
               
           
            for (int i1 = 0; i1 < targetObjectIndices.Length; i1++)
            {
                var targetIndex = targetObjectIndices[i1];
                var targetObject = stateData.TraitBasedObjects[targetIndex];
               
                if (!(LocationBuffer[agentObject.LocationIndex].Position == LocationBuffer[targetObject.LocationIndex].Position))
                    continue;

                var actionKey = new ActionKey(k_MaxArguments) {
                                                        ActionGuid = ActionGuid,
                                                       [k_agentIndex] = agentIndex,
                                                       [k_targetIndex] = targetIndex,
                                                    };
                argumentPermutations.Add(actionKey);
            }
            }
            agentObjectIndices.Dispose();
            targetObjectIndices.Dispose();
            agentFilter.Dispose();
            targetFilter.Dispose();
        }

        StateTransitionInfoPair<StateEntityKey, ActionKey, StateTransitionInfo> ApplyEffects(ActionKey action, StateEntityKey originalStateEntityKey)
        {
            var originalState = m_StateDataContext.GetStateData(originalStateEntityKey);
            var originalStateObjectBuffer = originalState.TraitBasedObjects;
            var originaltargetObject = originalStateObjectBuffer[action[k_targetIndex]];
            var originalagentObject = originalStateObjectBuffer[action[k_agentIndex]];

            var newState = m_StateDataContext.CopyStateData(originalState);
            var newDamageableBuffer = newState.DamageableBuffer;
            var newAttackerBuffer = newState.AttackerBuffer;
            var newLocationBuffer = newState.LocationBuffer;
            TraitBasedObject newdeadBodyObject;
            TraitBasedObjectId newdeadBodyObjectId;
            using (var deadBodyTypes =  new NativeArray<ComponentType>(2, Allocator.Temp) {[0] = typeof(Eatable), [1] = typeof(Location), })
            {
                newState.AddObject(deadBodyTypes, out newdeadBodyObject, out newdeadBodyObjectId);
            }
            {
                    var @Damageable = newDamageableBuffer[originaltargetObject.DamageableIndex];
                    @Damageable.Health -= newAttackerBuffer[originalagentObject.AttackerIndex].AttackDamage;
                    newDamageableBuffer[originaltargetObject.DamageableIndex] = @Damageable;
            }
            {
                    newState.SetTraitOnObject<UnderAttack>(default(UnderAttack), ref originaltargetObject);
            }
            {
                    var @Location = newLocationBuffer[newdeadBodyObject.LocationIndex];
                    @Location.Position = newLocationBuffer[originaltargetObject.LocationIndex].Position;
                    newLocationBuffer[newdeadBodyObject.LocationIndex] = @Location;
            }

           
            newState.RemoveTraitBasedObjectAtIndex(action[k_targetIndex]);

            var reward = Reward(originalState, action, newState);
            var StateTransitionInfo = new StateTransitionInfo { Probability = 1f, TransitionUtilityValue = reward };
            var resultingStateKey = m_StateDataContext.GetStateDataKey(newState);

            return new StateTransitionInfoPair<StateEntityKey, ActionKey, StateTransitionInfo>(originalStateEntityKey, action, resultingStateKey, StateTransitionInfo);
        }

        float Reward(StateData originalState, ActionKey action, StateData newState)
        {
            var reward = -1f;

            return reward;
        }

        public void Execute(int jobIndex)
        {
            m_StateDataContext.JobIndex = jobIndex; //todo check that all actions set the job index

            var stateEntityKey = m_StatesToExpand[jobIndex];
            var stateData = m_StateDataContext.GetStateData(stateEntityKey);

            var argumentPermutations = new NativeList<ActionKey>(4, Allocator.Temp);
            GenerateArgumentPermutations(stateData, argumentPermutations);

            var transitionInfo = new NativeArray<AttackFixupReference>(argumentPermutations.Length, Allocator.Temp);
            for (var i = 0; i < argumentPermutations.Length; i++)
            {
                transitionInfo[i] = new AttackFixupReference { TransitionInfo = ApplyEffects(argumentPermutations[i], stateEntityKey) };
            }

            // fixups
            var stateEntity = stateEntityKey.Entity;
            var fixupBuffer = m_StateDataContext.EntityCommandBuffer.AddBuffer<AttackFixupReference>(jobIndex, stateEntity);
            fixupBuffer.CopyFrom(transitionInfo);

            transitionInfo.Dispose();
            argumentPermutations.Dispose();
        }

       
        public static T GetAgentTrait<T>(StateData state, ActionKey action) where T : struct, ITrait
        {
            return state.GetTraitOnObjectAtIndex<T>(action[k_agentIndex]);
        }
       
        public static T GetTargetTrait<T>(StateData state, ActionKey action) where T : struct, ITrait
        {
            return state.GetTraitOnObjectAtIndex<T>(action[k_targetIndex]);
        }
       
    }

    public struct AttackFixupReference : IBufferElementData
    {
        internal StateTransitionInfoPair<StateEntityKey, ActionKey, StateTransitionInfo> TransitionInfo;
    }
}

Does anyone have an idea how to fix this? I am stuck. =\

These are my GameObject setups for the Decision and Trait components, if it helps:
SpiderAgent


(The GoapAgent/Attack Callback Method is empty for now, and the GoapAgent/Eat callback destroys the eatable GameObject, as instructed by the Step-by-step guide)

Prey

Thanks in advance!

Since my original post ended up very long and intimidating, I’ve decided to make a TL;DR version, separating my questions from my suggestions:

If anyone could reply to any of these, that would be much appreciated:

Questions:
1) What is the correct way of spawning an Entity with certain traits as the result of an action? Did I do it correctly in the “Effects>Objects Created” section of my Attack Action? (you can see it in the 6th spoiler of my previous post)
2) Is it possible to access a Trait based object created by an Action Effect and “attach” it to a GameObject or Entity in an Action callback?
3) Regarding the function GetTraitBasedObjectId, what is the int traitBasedObjectIndex parameter supposed to be?
4) What is the #if PLANNER_DOMAINS_GENERATED line for in the Match3 sample project’s CustomSwapEffect?
5) Does action order matter when setting up a Plan Definition?
6) Based on the images I shared in the previous post, do you see anything that could be stopping my SpiderAgent from using the Attack action on Prey? The SpiderAgent should understand that attacking Prey spawns an Eatable object, allowing it to Eat.

Suggestions:
A) Min and Max values for Trait fields
B) A way to define more complex Preconditions in our Actions, making it possible to use vector distance, collisions, line traces, and other things that a game might need to use as a condition.
C) Conditional effects. For example: If health is greater than zero, decrease health. If health equals zero, destroy target.
D) Change the name “Plan Definition” to something less ambiguous, like “Agent Profiles”.

As for the Burst error BC1025 I shared earlier, a friend of mine suggested that this line indicated by the error in the generated file…
using (var deadBodyTypes = new NativeArray<ComponentType>(2, Allocator.Temp) {[0] = typeof(Eatable), [1] = typeof(Location), })
… should be like this:
using (var deadBodyTypes = new NativeArray<ComponentType>(2, Allocator.Temp, NativeArrayOptions.UninitializedMemory) {[0] = ComponentType.ReadWrite<Eatable>(), [1] = ComponentType.ReadWrite<Location>(), })

By disabling “Reload Domain” in the Enter Play Mode Options, I was able to keep that code instead of having it regenerated, and that made the error go away! (Presumably because typeof() produces a managed System.Type, which Burst can’t compile, while ComponentType.ReadWrite<T>() stays in unmanaged code.) It didn’t fix the Spider’s non-attacking behavior, though, so that must be related to something else.


Thank you for giving the AI Planner a go. I’ll answer your questions below.

  1. Our current package does not have the entity-based (i.e. DOTS) workflow exposed, so if your project is entity-based, you’d still need a GameObject for receiving the action callbacks. Inside those callbacks you can do whatever you like to your entity-based world. However, this will change soon when we release an updated package that supports a DOTS workflow.
  2. Not currently. Only parameters assigned on the action can be bound to parameters on an action callback. The reason that planner actions allow objects to be created is to match what you’ll be doing in your game world. In other words, the planner is simulating the future based on the actions that are available, so if the creation of objects would happen normally and would affect future planning decisions, then those objects need to be represented in the model.
  3. The traitBasedObjectIndex is an index that comes from the action. It matches the ordinal of the parameters you defined in your action. With that index you can get back the ObjectId, which allows you to access traits on those objects (from the planner state); see the short snippet after this list. If you need to create relations in one of your traits (e.g. a carrier/carriable relationship), then you would set the field type to TraitBasedObject to be able to refer to another object.
  4. The PLANNER_DOMAINS_GENERATED define gets set when code gets generated. It lets you wrap any custom code that uses generated trait code in defines; otherwise, if you ever deleted the generated code directories, you’d get compile errors. This is changing in a future release, as we will simply skip compiling the custom assembly if no code has been generated yet.
  5. Action order does not matter because the planner will evaluate all actions based on the current state and the preconditions of each action.
  6. It’s possible that the spider’s plan has it eating the prey it just killed. Have you tried looking at the AI Planner visualizer to see what gets planned? Is the action showing up anywhere? If not, then it is probably an issue with modeling the domain of the problem. You can click on any of the state nodes to see what state the planner thinks it has at any point in the plan. If the issue is just that nothing visibly changes, you’d need to implement the world-state changes in your callback. You said the Attack callback is currently empty: you’d need to spawn a new prefab/GameObject with the correct traits to match what your Attack action says will happen. Then you would set the callback to “Use world state”. For an example of spawning prefabs into the world, take a look at the VacuumRobot sample.
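To illustrate (3) a bit more concretely, here is roughly how the index-to-object lookup chains together inside a custom effect (these are the same calls that appear later in this thread; the ordinals follow the parameter order declared on the action):

// Rough illustration inside ICustomActionEffect<StateData>.ApplyCustomActionEffectsToState;
// here "agent" was declared as parameter 0 and "target" as parameter 1 on the action.
public void ApplyCustomActionEffectsToState(StateData originalState, ActionKey action, StateData newState)
{
    // action[i] yields the trait-based object index for parameter i
    var targetId = originalState.GetTraitBasedObjectId(action[1]);
    var targetObject = originalState.GetTraitBasedObject(targetId);

    // With the object in hand, you can read traits from the planner state
    var damageable = originalState.GetTraitOnObject<Damageable>(targetObject);
}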

A) I’ve recorded this suggestion on our Favro board.
B) We have a LocationDistance custom reward modifier, but we don’t currently have a custom precondition for nearby distances; I’ll add the suggestion to our Favro board. Support for complex preconditions is available through ICustomPreconditions: https://docs.unity3d.com/Packages/com.unity.ai.planner@0.2/manual/CustomPlannerExtensions.html (a rough sketch follows this list). We don’t plan to over-complicate the UI to allow for complex statements, as this will probably be handled later down the line by Unity’s visual scripting system.
C) Complex effects are supported through ICustomActionEffect: https://docs.unity3d.com/Packages/com.unity.ai.planner@0.2/manual/CustomPlannerExtensions.html
D) I’ve shared this with the rest of the team and we’ll discuss.
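Regarding (B), a nearby-distance precondition could look something like the sketch below. Treat this as a sketch only: the interface name and method signature are assumptions modeled on ICustomActionEffect, and I’m assuming Location.Position is a Vector3; check the CustomPlannerExtensions docs linked above for the exact API.

public class NearbyPrecondition : ICustomPrecondition<StateData>
{
#if PLANNER_DOMAINS_GENERATED
    const float k_MaxAttackDistance = 1.5f; // hypothetical tuning value

    // Assumed signature, modeled on ICustomActionEffect; verify against the docs
    public bool CheckCustomPrecondition(StateData state, ActionKey action)
    {
        // action[0] = agent, action[1] = target (parameter order from the action)
        var agent = state.GetTraitBasedObject(state.GetTraitBasedObjectId(action[0]));
        var target = state.GetTraitBasedObject(state.GetTraitBasedObjectId(action[1]));

        var agentPos = state.GetTraitOnObject<Location>(agent).Position;
        var targetPos = state.GetTraitOnObject<Location>(target).Position;

        // "Nearby" instead of the default exact-position match
        return Vector3.Distance(agentPos, targetPos) <= k_MaxAttackDistance;
    }
#endif
}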

We’ve fixed this already in our development branch, but for now you’ll have to disable Burst.


Thanks for reading and replying, @amirebrahimi_unity ! It was very helpful! I wanted to share an update on my progress. Hopefully it’ll help others who are interested in the AI Planner and might be struggling with it.

It turns out that the reason my SpiderAgent wasn’t moving towards any Prey was that the Prey prefab had a Moveable trait, which the Navigate action does not allow in its Destination parameter. Once I removed the Moveable trait, the spider was able to move to them. It would be nice if Navigate didn’t have that restriction, but I already knew that at some point I would have to make my own Navigator anyway.

After reading your explanation of traitBasedObjectIndex and realizing that I could access an ActionKey with an indexer (e.g. action[0] and action[1]), I was able to make a custom effect for my Attack action that decreases a target’s health while respecting the minimum value of 0. It also destroys the target once its health equals zero.

I was also modifying the TraitBasedObject that I got from newState.GetTraitBasedObject at first, but then I noticed that it’s a struct and switched to newState.SetTraitOnObject.

Here’s the code for my Custom Action Effect, if you’d like to see it:

// Imports mirror the ones in the generated Attack.cs, plus UnityEngine for Mathf:
using Unity.Collections;
using Unity.Entities;
using Unity.AI.Planner;
using Unity.AI.Planner.DomainLanguage.TraitBased;
using UnityEngine;
using AI.Planner.Domains;

public class DamagePreyEffect : ICustomActionEffect<StateData>
{
#if PLANNER_DOMAINS_GENERATED
    public void ApplyCustomActionEffectsToState(StateData originalState, ActionKey action, StateData newState)
    {
        // action[0] = agent, action[1] = target (the parameter order of my Attack action)
        TraitBasedObject originalAgentObject = originalState.GetTraitBasedObject(originalState.GetTraitBasedObjectId(action[0]));
        TraitBasedObject originalTargetObject = originalState.GetTraitBasedObject(originalState.GetTraitBasedObjectId(action[1]));

        TraitBasedObject newTargetObject = newState.GetTraitBasedObject(newState.GetTraitBasedObjectId(action[1]));

        float attackDamage = originalState.GetTraitOnObject<Attacker>(originalAgentObject).AttackDamage;

        Damageable damageable = originalState.GetTraitOnObject<Damageable>(originalTargetObject);

        // Decrease health while respecting the lower bound of 0
        damageable.Health = Mathf.Max(0, damageable.Health - attackDamage);

        newState.SetTraitOnObject(damageable, ref newTargetObject);

        if (damageable.Health == 0)
        {
            // The target died: remove it and spawn an Eatable object at its location
            newState.RemoveObject(newTargetObject);

            newState.AddObject(
                new NativeArray<ComponentType>(2, Allocator.Temp)
                {
                    [0] = ComponentType.ReadWrite<Eatable>(),
                    [1] = ComponentType.ReadWrite<Location>()
                },
                out TraitBasedObject deadBody,
                out _
            );

            Location targetLocation = newState.GetTraitOnObject<Location>(originalTargetObject);
            newState.SetTraitOnObject(targetLocation, ref deadBody);
        }
    }
#endif
}

The last thing I needed to do for my system to work was create the Attack callback. I thought that by modifying newState with SetTraitOnObject, RemoveObject or AddObject in my ApplyCustomActionEffectsToState function, my GameObject would have its traits updated automatically after the Action was performed, and that I only needed callbacks for additional behavior. But that’s not the case, so I added the callback below, and then the spider started acting correctly:

// In my SpiderAgent class (FirstOrDefault needs "using System.Linq;"):
    public void Attack(GameObject damageable, float previousHealth, float attackDamage)
    {
#if PLANNER_DOMAINS_GENERATED
        // Find the Damageable trait data on the target, then mirror the planner-side clamping
        ITraitData damageableTrait = damageable.GetComponent<ITraitBasedObjectData>().TraitData.FirstOrDefault(t => t.TraitDefinitionName == nameof(Damageable));
        float newHealth = Mathf.Max(0, previousHealth - attackDamage);
        damageableTrait.SetValue("Health", newHealth);

        if (newHealth <= 0)
        {
            damageable.GetComponent<PreyAgent>().DieAndSpawnFood();
        }
#endif
    }

// In my PreyAgent class:
    public GameObject eatablePrefab;

    public void DieAndSpawnFood()
    {
        Instantiate(eatablePrefab, transform.position + Vector3.right * -2, transform.rotation); // The offset is so that the Spider has to move a little bit and the Eat action doesn't finish instantly
        Destroy(gameObject);
    }

Now I am dealing with two new problems:
1) The spider is a total psychopath! I wanted to test if it was attacking prey for the right reason (to spawn food, so that it could Eat), but even after removing the part of the effect that spawns food from the Attack action (in the ActionEffect itself AND in the callback), it kept attacking all prey in the scene. Attacking has a cost (-1), and Eating has a Reward (999). It’s not gaining anything by just attacking, so I don’t get why the spider keeps doing it. I suspect that it might be related to my Plan Definition’s search settings. What are the Heuristic, Bounds and Discount Factor parameters supposed to be? :eyes:
2) If I have a single spider in my scene, it always moves towards the closest target. However, if I add multiple spiders to my scene, they all go to the same target (and then all except one of them fail and freeze). How can I make it so that each spider thinks independently?

If you could help me with any of those two problems, that would be much appreciated! :slight_smile:

Thanks again and have a great weekend! :smile:

[quote=“Blueprinter, post:4, topic: 779484, username:Blueprinter”]
I was also modifying the TraitBasedObject that I got from newState.GetTraitBasedObject at first, but then I noticed that it’s a struct and switched to newState.SetTraitOnObject.
[/quote]

Yes, this is a common point of confusion for users. Unfortunately, we cannot provide ref returns of the trait data from the underlying data buffers, currently.

Ah. This is another common misconception. The planner action is simply a model of what should happen when an action is performed. There is no enforcement of the action in the live game state; that is what your action code is responsible for. In your action code, you’ll need to set all of the changed trait data on the various GameObject TraitComponents; then you can pick up the changes via a world query (as set per action on the DecisionController → Available actions from the plan → Next State Update). If the modeled action effects (action definition) match exactly the executed action effects (from your action callback), you can skip the query and “Use Next Plan State” instead. Using the next plan state instead of querying the world is more performant, but it requires exact modeling of your action; otherwise, your controller can become out of sync with the true game state.

First, ensure your agent is updated with the latest, accurate state data. This should be true if you’re querying the world for the new state at the end of each action. You can inspect the state data in the plan visualizer window (see: docs). The root/initial state of the plan should reflect the data from the most recent state update after an action has completed. Check to make sure your new food has been added to the state representation.

I don’t suspect this is your problem, but I won’t rule it out either. I’ll try to give a high-level/conceptual explanation of these concepts.

Our planner operates by incrementally searching through the space of possible plans, attempting to arrive at the “best” plan. In many cases this process may take a while, and we’d like planning agents to begin enacting plans before they’ve been fully completed (i.e. before the planner has found the complete sequence of actions to some terminal or goal state). The way we select one of these “partial” plans is by ordering them according to their expected total reward, choosing the plan with the highest expected reward as our “best” plan to enact. So how do we estimate the total reward for a partial plan?

I’m going to modify your use case as an example. Consider we are making a plan for an agent to eat 3 foods and it can attack prey to create food. In this simplified example, I’m assuming a single attack kills the prey and creates food. Also, eat gains 999 reward while attack has a cost of -1.

Our partial plan may look like this:

Attack (-1) → Eat (+999) → ???

Eventually, we want to arrive at a complete plan like this:

Attack (-1) → Eat (+999) → Attack (-1) → Eat (+999) → Attack (-1) → Eat (+999) → Done!

This complete plan has a total reward of -1 + 999 -1 + 999 -1 + 999 = 2,994. Perfect. But how do we estimate that our partial plan

Attack (-1) → Eat (+999) → ???

will become

Attack (-1) → Eat (+999) → Attack (-1) → Eat (+999) → Attack (-1) → Eat (+999) → Done!

This is where the heuristic comes in. It’s an estimate of the reward to be gained in the yet incomplete portion of the plan. The total plan estimate is simply the part we know (Attack (-1) → Eat (+999) = 998) plus the part we are estimating ( ??? → heuristic estimate). In this simple case, an easy estimate would be to guess we’ll attack then eat twice more, resulting in Heuristic = (999-1) * food_left_to_be_eaten, which would lead to a total plan estimate of:

Attack (-1) → Eat (+999) → ??? ( heuristic = (999-1) * 2)
Partial plan reward = -1 + 999 = 998
Heuristic from end of plan onward: (999 - 1) * 2 = 1,996
Estimated total reward = 998 + 1,996 = 2,994!

Now, obviously it’s hard to come up with accurate estimates in every possible scenario (if you could, you wouldn’t actually need a planner; you could just use a utility AI system). The planner refines the estimates by growing the plans, since the more of the plan you know and the fewer steps you have to estimate, the better your total estimate will be. But the planner needs to reason intelligently about which plans to grow within its limited computational budget. For this reason, we actually make three heuristic estimates: an optimistic estimate, an average/expected estimate, and a pessimistic estimate. Together, these make up a BoundedValue. Without delving too much into the details, the optimistic estimate helps the planner determine which plans to grow, while the pessimistic estimate helps the planner prune out poorly performing plans. Let’s extend the above example with these two additional estimates:

Consider an optimistic scenario: there’s just food lying around already, so there’s no need to attack. It might instead go like this:

Attack (-1) → Eat (+999) → Eat (+999) → Eat (+999) → Done!

So the partial plan

Attack (-1) → Eat (+999) → ???

can have an optimistic projection of OptimisticHeuristic = (+999) * food_left_to_be_eaten

Attack (-1) → Eat (+999) → ??? ( OptimisticHeuristic = (+999) * 2)
Partial plan reward = -1 + 999 = 998
Heuristic from end of plan onward: (+999 ) * 2 = 1,998
Estimated total reward = 998 + 1,998 = 2,996! (better than 2,994!)

Similarly, perhaps there are no creatures to attack, leading the agent to starve (-1000). This pessimistic projection may look like:

Attack (-1) → Eat (+999) → Starve (-1000) → Done!

So, again, the partial plan

Attack (-1) → Eat (+999) → ???

can have a pessimistic projection of PessimisticHeuristic = -1000

Attack (-1) → Eat (+999) → ??? ( PessimisticHeuristic = -1000)
Partial plan reward = -1 + 999 = 998
Heuristic from end of plan onward: -1000
Estimated total reward = 998 - 1000 = -2 (much worse than 2,994!)

So, all together, our heuristic estimate for this example would yield [upperbound = 2,996, expected = 2,994, lower = -2] for this particular partial plan (Attack → Eat → ???). Heuristics are computed per tail state of each partial plan, so this particular exercise is specific to the ??? state after the two actions have taken place. The heuristic values would be different for a partial plan of (Attack → Eat → Attack → Eat → ???) and so on. Eventually, the planner will arrive at the complete plan (Attack → Eat → Attack → Eat → Attack → Eat → Done!), which will have only exact rewards and no heuristic estimate. For more info on designing a heuristic, see here.
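As a rough sketch, the worked example above could be expressed as a custom heuristic along the following lines. Note the assumptions: I’m using the ICustomHeuristic<StateData>/BoundedValue extension point described in the custom planner extensions docs, the hard-coded rewards are just this example’s numbers, and you should verify the exact type names and constructor order against your package version:

public class SpiderHeuristic : ICustomHeuristic<StateData>
{
#if PLANNER_DOMAINS_GENERATED
    public BoundedValue Evaluate(StateData state)
    {
        // Count the remaining Eatable objects in this state
        var filter = new NativeArray<ComponentType>(1, Allocator.Temp) { [0] = ComponentType.ReadWrite<Eatable>() };
        var indices = new NativeList<int>(Allocator.Temp);
        state.GetTraitBasedObjectIndices(indices, filter);
        int foodLeft = indices.Length;
        indices.Dispose();
        filter.Dispose();

        float upper = 999f * foodLeft;          // optimistic: food is already lying around, no attacks needed
        float average = (999f - 1f) * foodLeft; // expected: one attack (-1) per eat (+999)
        float lower = -1000f;                   // pessimistic: the agent starves before eating anything

        // Assumed constructor order: (lower bound, average, upper bound)
        return new BoundedValue(lower, average, upper);
    }
#endif
}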

Finally, the discount factor. In many situations, you want your agents to weigh near term actions more importantly than actions far into the future. Think of it as instant vs delayed gratification. The discount factor (value [0 → 1]) affects this weighting by the agent, with values close to 0 weighting immediate actions more highly. Conversely, a value of 1 (the default) weights costs/rewards equally, no matter where they take place in the plan. I’ll give two examples with a discount of 0.9.

The complete plan:

Attack (-1) → Eat (+999) → Attack (-1) → Eat (+999) → Attack (-1) → Eat (+999) → Done!
Total reward = -1 * 0.9^0 + 999 * 0.9^1 - 1 * 0.9^2 + 999 * 0.9^3 - 1 * 0.9^4 + 999 * 0.9^5 = 2,214.80441

The partial plan:

Attack (-1) → Eat (+999) → ??? (here, I’ll use heuristic = (999-1) * 2)
Total estimated reward = -1 * 0.9^0 + 999 * 0.9^1 + (999-1) * 2 * 0.9^2 = 2,514.86

Here, you can see that the partial plan estimate overestimates the complete plan’s reward, as it doesn’t take into account the action ordering and discounting. This is normal. As the plan is extended, it will converge to the true total reward.
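If it helps to see the discounting mechanically, here is the complete plan’s total reward as a few lines of plain C# (no planner API involved), reproducing the 2,214.80441 figure above:

// Discounted return of the complete plan: Attack, Eat, repeated three times
float[] rewards = { -1f, 999f, -1f, 999f, -1f, 999f };
float discountFactor = 0.9f;

double total = 0;
for (int step = 0; step < rewards.Length; step++)
    total += rewards[step] * System.Math.Pow(discountFactor, step); // reward * 0.9^step

// total ≈ 2214.80441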

Hope that clears up rewards, partial plans, and heuristics!

Ah. I suspect I know what the issue is here. Every spider picks up all the other spiders in its state representation. The planner for each spider then concludes that a specific spider that is close to a prey should go to that prey. However, what you want is for the planner to decide what each individual spider should do, and the planner doesn’t know which spider in the state is the “this is me” spider. It’s an issue that has cropped up from time to time, and we are adding support to make this easier. For now, on each spider GameObject, you can build the world query (DecisionController → Include Objects → World Query) by adding a filter for “From GameObject”, assigning itself. You’ll also want to add a “Without Traits” filter (choosing your spider trait(s)) to any other query components, which will filter the other spiders out of the state. Hopefully this will fix the issue. Let us know!

You too!


@TrevorUnity : Thank you so much for the detailed reply! :smile:

Interesting! I added the “From GameObject” filter to my SpiderAgent prefab, pointing to itself. I don’t have any other objects with a DecisionController, so I didn’t change any other queries. But all of my spiders stopped moving now, so I must be missing something. Any ideas?

Can you share what your query in the DecisionController looks like?

Sure!

I think I see the problem now. The query isn’t just used when the spider needs to find itself. It’s also used to find all the other objects used in the spider’s actions.

Since I replaced “All world objects” with the spider’s own GameObject, it can’t find any object to attack or eat.

I think this is the query that I need: “All world objects, except spiders that aren’t this one”. Is it possible to write a custom query in C# and use it in the Decision Controller? It seems like it’s not possible to define such a query using only the editor.

To do that, if your Attacker trait is only used on spiders, you could create a query like this:

  • From Object “SpiderAgent”
  • With Traits Attacker
  • OR
  • Without Traits Attacker

You could create a Spider trait to tag your spiders specifically, if Attacker is used on other types of objects.

Also, you can create a custom query by implementing BaseQueryFilter and using QueryAttribute, but this isn’t documented yet.


@mplantady_unity : That worked! Thank you! :smile:

I also removed the “With Traits Attacker” part of the query, and it still worked.

I guess this concludes my experiment for now. Thank you all for your patience and replies. I really like where this tool is going, and I’m looking forward to the next releases. Keep it up!

When do you think a DOTS workflow will be released? And when it does, would it support the ability for a Precondition to depend on the execution of a system? For example, right now the only way that a precondition can “call” a system is for it to, frame by frame,

  1. Create a message entity.
  2. The system queries for that entity, performs the expensive computations, and writes the result to an entity.
  3. The AI Planner reads from the written entity and uses that to compute the precondition.

This currently takes 3 frames, which does not fit into the AI Planner’s single-frame heuristic calculations. If this workflow is not recommended, are there any other ways to spread a heuristic run beyond a single frame?

Unfortunately, we don’t have a timeline for this yet. Our current priority has been on iterating with our internal customers on the authoring workflows, public API, etc, as it impacts far more of our user base. Some of these changes should allow for custom use of the system, which can enable users to build their own DOTS-compatible systems, but an official workflow from us is a longer term endeavor.

For the planning process to maintain correctness, it requires that conditions depend only on data contained within the state, as any external data may change, violating the Markov property. Can I ask what type of external information you wish to condition the plans on? Can it be wrapped into the state via traits/objects?

As for long running times, many of our jobs—including the heuristic evaluation job—can run over multiple frames. In practice, though, we haven’t encountered any significantly complex heuristics that require multiple frames to compute. Can you share more about the heuristic you’re using?


Thank you, I guess I’ll continue using a behavior tree implementation. The AI Planner looks really great nonetheless.

Some of the conditions depend on logic that needs a system to execute - logic that requires querying for entities. For example, the major compatibility breakers in my codebase are the systems that calculate paths - they need to query for entities, which requires segueing into a system. These entities could be “enemy markers”, which influence which areas of the map are considered safe, “destination reservations”, which denote an area already occupied by an ally, and so forth. I guess those entities could be converted to GameObjects with traits, and the AI Planner would “query” those instead. Does the AI Planner acknowledge trait-owning GameObjects that have been converted to Entities via Convert and Inject?

Sorry, I miscommunicated which part of the AI Planner I need to work around. What I want to solve is the challenge of spreading computation across multiple frames, and having the AI Planner be aware of this and wait accordingly. Right now all my ECS systems can be divided into two categories: message and ongoing. Message systems query for a message entity, which upon being created triggers those systems into action. These require synchronization on the AI Planner’s part, since the AI Planner has to execute the action that creates the message entity, wait one frame for the system to process that message, and finally query the result of the computation from an entity. Meanwhile, ongoing systems execute continuously, are often used to calculate things like vision and ally proximity, and can be queried whenever. Maybe the message systems can be converted into IJobs and the system queries into GameObjects with traits so that my code will fit into the AI Planner. For concreteness, the sketch below shows the message pattern I mean.
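(A stripped-down version of that pattern, for illustration only: PathRequest/PathResult and the cost logic are made-up stand-ins, using the Entities 0.x SystemBase API.)

using Unity.Entities;
using Unity.Mathematics;

// Made-up message component: an action creates an entity carrying this request
public struct PathRequest : IComponentData
{
    public float3 From;
    public float3 To;
}

// Made-up result component, written a frame later for the planner (or anyone else) to read
public struct PathResult : IComponentData
{
    public float Cost; // e.g. path cost factoring in enemy markers/reservations
}

public class PathRequestSystem : SystemBase
{
    protected override void OnUpdate()
    {
        // Consume each pending request, do the expensive computation,
        // and write the result back onto the same entity
        Entities
            .WithStructuralChanges()
            .ForEach((Entity entity, in PathRequest request) =>
            {
                float cost = math.distance(request.From, request.To); // stand-in for real pathfinding
                EntityManager.AddComponentData(entity, new PathResult { Cost = cost });
                EntityManager.RemoveComponent<PathRequest>(entity);
            })
            .Run();
    }
}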

As for running over multiple frames, how can I get the AI Planner to run the heuristic across the duration of multiple frames? Something like returning a JobHandle to the AI Planner.

Your own or a publicly available implementation? I’m always interested in other DOTS-compatible AI tools.

Ah. I see. I misunderstood initially. But yes, our jobs are not set up such that they can be delayed a frame or two to wait on a system. Heuristics are evaluated for every non-terminal state searched during planning, which typically number in the hundreds or thousands. Such dependencies would slow the computation down immensely.

Not in the publicly released version, no. @amirebrahimi_unity has been moving our backing data to Entities so you can use GameObjects or Entities with traits. We don’t have a release date yet for that change, but it’s in the works.

Hmm. The issue I see with this is that the heuristics are evaluated from inside of a job, so you wouldn’t be able to kick off another job to do the other computation. It seems that you’d want to intercept which states will be evaluated, then schedule an intermediate job as a dependency of the heuristic job. Injecting into our planning jobs pipeline could be potentially very powerful, but it’s likely not a feature that would be a high priority for us. Let me stew on this for a bit.


StateTransitionInfoPair error: “inaccessible due to protection level”
I have looked everywhere for insight into what this function is about, to no avail.
Where did you find this?
Help!

StateTransitionInfoPair<StateEntityKey, ActionKey, StateTransitionInfo> ApplyEffects(ActionKey action, StateEntityKey originalStateEntityKey)
{
    var originalState = m_StateDataContext.GetStateData(originalStateEntityKey);
    var originalStateObjectBuffer = originalState.TraitBasedObjects;
    var originaltargetObject = originalStateObjectBuffer[action[k_targetIndex]];
    var originalagentObject = originalStateObjectBuffer[action[k_agentIndex]];

    var newState = m_StateDataContext.CopyStateData(originalState);
    var newDamageableBuffer = newState.DamageableBuffer;
    var newAttackerBuffer = newState.AttackerBuffer;
    var newLocationBuffer = newState.LocationBuffer;
    TraitBasedObject newdeadBodyObject;
    TraitBasedObjectId newdeadBodyObjectId;
    using (var deadBodyTypes = new NativeArray<ComponentType>(2, Allocator.Temp) {[0] = typeof(Eatable), [1] = typeof(Location), })
    {
        newState.AddObject(deadBodyTypes, out newdeadBodyObject, out newdeadBodyObjectId);
    }
    {
        var @Damageable = newDamageableBuffer[originaltargetObject.DamageableIndex];
        @Damageable.Health -= newAttackerBuffer[originalagentObject.AttackerIndex].AttackDamage;
        newDamageableBuffer[originaltargetObject.DamageableIndex] = @Damageable;
    }
    {
        newState.SetTraitOnObject<UnderAttack>(default(UnderAttack), ref originaltargetObject);
    }
    {
        var @Location = newLocationBuffer[newdeadBodyObject.LocationIndex];
        @Location.Position = newLocationBuffer[originaltargetObject.LocationIndex].Position;
        newLocationBuffer[newdeadBodyObject.LocationIndex] = @Location;
    }

    newState.RemoveTraitBasedObjectAtIndex(action[k_targetIndex]);

    var reward = Reward(originalState, action, newState);
    var StateTransitionInfo = new StateTransitionInfo { Probability = 1f, TransitionUtilityValue = reward };
    var resultingStateKey = m_StateDataContext.GetStateDataKey(newState);

    return new StateTransitionInfoPair<StateEntityKey, ActionKey, StateTransitionInfo>(originalStateEntityKey, action, resultingStateKey, StateTransitionInfo);
}

float Reward(StateData originalState, ActionKey action, StateData newState)
{
    var reward = -1f;

    return reward;
}

public struct AttackFixupReference : IBufferElementData
{
    internal StateTransitionInfoPair<StateEntityKey, ActionKey, StateTransitionInfo> TransitionInfo;
}

Is this an error you got from the Planner’s generated code, or from trying to access it in your own code?

This type is located in the AI Planner package runtime assembly (in PolicyGraphData.cs), it holds information about the transition between 2 states that occurs after an Action was applied.

This type is internal and cannot be accessed from your own code, unless you access it from a special assembly definition that has friend access to the Planner assembly.


In the first post, under “Full Attack.cs”. I was trying to follow what the OP had done. I’m also trying to duplicate the Otto example, but much has changed and there isn’t enough documentation. Otto has a lot of good info in it, even if it no longer compiles. I’m stuck on the foreach loops; there seems to be no equivalent that I comprehend… yet.