29 Million GC Alloc - A* Pathfinding - How to Improve?

I implemented the A* Pathfinding algorithm into my game, but occasionally I’ll get massive GC Alloc amounts which cause the game to freeze for several seconds.

I’m guessing it’s the adding to and removing from lists that’s causing the GC alloc, but I’m not sure.

These really huge GC Alloc spikes occur when the path cannot reach its destination, so the search visits every node before returning null (normally it searches fewer than 100 nodes, but if no path exists it searches all 40,000+). However, even searching just 50-100 nodes can cause 1-2 million GC Alloc, which also causes some stuttering, so I’d like to improve the garbage collection overall.

How I get the NodePath in my MoveTo coroutine.

        List<Node> nodePath = Pathfinding.instance.FindPath(aiController.transform.position, ai.target.position);

FindPath Method

    public List<Node> FindPath(Vector3 a_StartPos, Vector3 a_TargetPosition)
    {
        Node startNode = grid.NodeFromWorldPosition(a_StartPos);
        Node targetNode = grid.NodeFromWorldPosition(a_TargetPosition);

        List<Node> openList = new List<Node>();
        HashSet<Node> closedList = new HashSet<Node>();

        openList.Add(startNode);

        int x = 0;
        while (openList.Count > 0)
        {
            x++;
            Node currentNode = openList[0];
            for (int i = 1; i < openList.Count; i++)
            {
                if (openList[i].fCost < currentNode.fCost || openList[i].fCost == currentNode.fCost && openList[i].hCost < currentNode.hCost)
                {
                    currentNode = openList[i];
                }
            }
            openList.Remove(currentNode);
            closedList.Add(currentNode);

            if (currentNode == targetNode)
            {
                List<Node> finalPath = new List<Node>();
                Node curNode = targetNode;

                while (curNode != startNode)
                {
                    finalPath.Add(curNode);
                    curNode = curNode.parent;
                }

                finalPath.Reverse();

                GetFinalPath(startNode, targetNode);
                Debug.Log("Nodes Searched :" + x);
                return finalPath;
            }

            foreach (Node neighborNodes in grid.GetNeightboringNodes(currentNode))
            {
                if ((!neighborNodes.isWall && !neighborNodes.isOccupied) || closedList.Contains(neighborNodes))
                {
                    continue;
                }

                int moveCost = currentNode.gCost + GetGetManhattenDistance(currentNode, targetNode);

                if (moveCost < neighborNodes.fCost || !openList.Contains(neighborNodes))
                {
                    neighborNodes.gCost = moveCost;
                    neighborNodes.hCost = GetGetManhattenDistance(neighborNodes, targetNode);
                    neighborNodes.parent = currentNode;

                    if (!openList.Contains(neighborNodes))
                    {
                        openList.Add(neighborNodes);
                    }
                }
            }
        }
        Debug.Log("Nodes Searched :" + x);
        return null;
    }

Method for determining neighboring nodes

    public List<Node> GetNeightboringNodes(Node a_Node)
    {
        List<Node> neighboringNodes = new List<Node>();

        for (int x = -1; x <= 1; x++)
        {
            for (int y = -1; y <= 1; y++)
            {
                for (int z = -1; z <= 1; z++)
                {
                    //if we are on the node that was passed in, skip this iteration.
                    if (x == 0 && y == 0 && z == 0)
                    {
                        continue;
                    }

                    int checkX = a_Node.gridX + x;
                    int checkY = a_Node.gridY + y;
                    int checkZ = a_Node.gridZ + z;

                    //Make sure the node is within the grid.
                    if (checkX >= 0 && checkX < gridSizeX && checkY >= 0 && checkY < gridSizeY && checkZ >= 0 && checkZ < gridSizeZ)
                    {
                        neighboringNodes.Add(grid[checkX, checkY, checkZ]); //Adds to the neighbours list.
                    }
                }
            }
        }

        return neighboringNodes;
    }

From that code it seems it’s mostly Lists and HashSets that get created. You might want to reuse them and clear them in between calls instead of creating them anew every time. Just pass a list into GetNeightboringNodes to be cleared and populated in there. That should already reduce the GC Allocs significantly.
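
A minimal sketch of that idea (the buffer field names are mine, not from the original code): the open/closed collections become fields on the pathfinding script that are cleared rather than recreated, and the grid’s GetNeightboringNodes fills a caller-provided list.

    // On the pathfinding script: cached between calls. Clear() keeps the backing
    // storage alive, so once the collections have grown to their working size
    // they no longer produce garbage.
    private readonly List<Node> openList = new List<Node>();
    private readonly HashSet<Node> closedList = new HashSet<Node>();
    private readonly List<Node> neighborBuffer = new List<Node>();

    // On the grid script: fills a caller-provided list instead of allocating
    // a new one on every call.
    public void GetNeightboringNodes(Node a_Node, List<Node> results)
    {
        results.Clear();
        for (int x = -1; x <= 1; x++)
        {
            for (int y = -1; y <= 1; y++)
            {
                for (int z = -1; z <= 1; z++)
                {
                    if (x == 0 && y == 0 && z == 0)
                    {
                        continue;
                    }

                    int checkX = a_Node.gridX + x;
                    int checkY = a_Node.gridY + y;
                    int checkZ = a_Node.gridZ + z;

                    if (checkX >= 0 && checkX < gridSizeX && checkY >= 0 && checkY < gridSizeY && checkZ >= 0 && checkZ < gridSizeZ)
                    {
                        results.Add(grid[checkX, checkY, checkZ]);
                    }
                }
            }
        }
    }

FindPath would then start with openList.Clear() and closedList.Clear(), and each iteration would call GetNeightboringNodes(currentNode, neighborBuffer) and loop over neighborBuffer.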

Also the Call Stacks feature of the Profiler and Deep Profiling might help you further identify allocations in this code.

Thanks, changing the lists seemed to reduce the garbage collection a lot. I went in and deep profiled as you suggested, but everything excluding debug seems to read 0.0%. I’m assuming this is because each one is a very small percentage and there’s just a lot of them, so I’d need to find a way to reduce the number of them being used, correct?

It seems calling .Contains on the lists is the primary source of CPU usage now, leading to over 1000 ms sometimes. Are there any alternatives to .Contains that would run more efficiently?

(Screenshot attached.)

Are you looking at Raw Hierarchy? Hierarchy might make more sense to see the cost of the Contains calls added up. Deep Profiling might inflate the apparent cost of methods based on how many subsamples occur under them, as each sample (e.g. each automatically sampled managed method) comes with roughly 10 nanoseconds of overhead. But this seems like more than just deep profiling overhead.

Maybe use a different container, or an additional container alongside the Lists, if you need to do Contains checks. A HashSet, for example. Just make sure to profile the before and after, both with deep profiling and especially without it, and then compare those captures in the Profile Analyzer package to see if it made as much of a positive impact as you’d hoped.
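
A minimal sketch of that pairing (the helper names are made up for illustration; Node is assumed to be a class, so the HashSet falls back to reference equality):

    // The List keeps the nodes for iteration; the HashSet exists purely for
    // O(1) membership tests. Both must be updated together.
    private readonly List<Node> openList = new List<Node>();
    private readonly HashSet<Node> openSet = new HashSet<Node>();

    private void AddToOpen(Node node)
    {
        if (openSet.Add(node)) // Add returns false if the node is already present
        {
            openList.Add(node);
        }
    }

    private void RemoveFromOpen(Node node)
    {
        openSet.Remove(node);
        openList.Remove(node);
    }

    private bool IsInOpen(Node node)
    {
        return openSet.Contains(node); // replaces the slow openList.Contains(node)
    }

Note that openList.Remove is still linear; this only speeds up the Contains checks, which is what the profiler flagged.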

Your “GetNeightboringNodes” function is completely unnecessary. Extract the code that handles a single neighbouring node into a function and just call it directly from the neighbour loops. There’s no need to add those nodes to a list that you discard right away afterwards. However, that is the least of your problems.

Change the type of “openList” from List to a PriorityQueue. That way you don’t have to go through the whole list on every iteration of the loop.

Your grid seems to be finite, and you’re storing it in a multidimensional array. Instead of storing “closedList” in a HashSet, allocate a single bool array, “new bool[grid.GetLength(0), grid.GetLength(1), grid.GetLength(2)]”, which you can use to record whether a node has been visited. Not only will the check be faster, you’ll also stop allocating every time you add a node to the HashSet.
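
A sketch of that swap, reusing the gridSizeX/Y/Z fields from the grid script (the helper method names are made up):

    // Visited flags, allocated once for the whole grid and reused between searches.
    // Array.Clear only zeroes the existing memory, so nothing is handed to the GC.
    private bool[,,] closed;

    private void ResetClosedSet()
    {
        if (closed == null)
        {
            closed = new bool[gridSizeX, gridSizeY, gridSizeZ];
        }
        else
        {
            System.Array.Clear(closed, 0, closed.Length);
        }
    }

    private void MarkClosed(Node node)
    {
        closed[node.gridX, node.gridY, node.gridZ] = true;
    }

    private bool IsClosed(Node node)
    {
        return closed[node.gridX, node.gridY, node.gridZ];
    }

FindPath would call ResetClosedSet() once at the top, then use MarkClosed and IsClosed in place of closedList.Add and closedList.Contains.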

(Screenshot attached.)

I believe I’ve implemented both your GetNeighboringNodes suggestion and your bool array suggestion, but I’m not really sure what a priority queue is. Is it a custom data type that I need to make?

I also switched openList to a dictionary, hoping that going from O(n) to O(1) lookups would improve speed, and added a bool “isInOpenList” to my nodes so that I wouldn’t have to use ContainsKey. I think it helped, but adding to and removing from the dictionary still seems to generate a decent amount of GC. So I’d like to go with your initial suggestion of a PriorityQueue if you can explain it more.

    public List<Node> FindPath(Vector3 a_StartPos, Vector3 a_TargetPosition)
    {
        Node startNode = grid.NodeFromWorldPosition(a_StartPos);
        Node targetNode = grid.NodeFromWorldPosition(a_TargetPosition);
        bool[,,] gridBool = new bool[grid.gridSizeX, grid.gridSizeY, grid.gridSizeZ];

        if (!targetNode.isWall)
        {
            return null;
        }

        startNode.gCost = 0;

        foreach(KeyValuePair<int, Node> node in openDictionary)
        {
            node.Value.isInOpenList = false;
        }

        openDictionary.Clear();
        openDictionary.Add(startNode.id, startNode);
        startNode.isInOpenList = true;

        while (openDictionary.Count > 0)
        {
            Node currentNode = openDictionary.First().Value;
            foreach(KeyValuePair<int, Node> k in openDictionary)
            {
                if(k.Value.fCost < currentNode.fCost || k.Value.fCost == currentNode.fCost && k.Value.hCost < currentNode.hCost)
                {
                    currentNode = k.Value;
                }
            }
            openDictionary.Remove(currentNode.id);
            currentNode.isInOpenList = false;
            gridBool[currentNode.gridX, currentNode.gridY, currentNode.gridZ] = true;

            if (currentNode == targetNode)
            {
                finalPath.Clear();
                Node curNode = targetNode;

                while (curNode != startNode)
                {
                    finalPath.Add(curNode);
                    curNode = curNode.parent;
                }

                finalPath.Reverse();

                GetFinalPath(startNode, targetNode);
                return finalPath;
            }

            for (int x = -1; x <= 1; x++)
            {
                for (int y = -1; y <= 1; y++)
                {
                    for (int z = -1; z <= 1; z++)
                    {
                        if (x == 0 && y == 0 && z == 0)
                        {
                            continue;
                        }

                        int checkX = currentNode.gridX + x;
                        int checkY = currentNode.gridY + y;
                        int checkZ = currentNode.gridZ + z;

                        Node neighboringNode = null;
                        if (checkX >= 0 && checkX < grid.gridSizeX && checkY >= 0 && checkY < grid.gridSizeY && checkZ >= 0 && checkZ < grid.gridSizeZ)
                        {
                            neighboringNode = grid.grid[checkX, checkY, checkZ];
                        }
                        else
                        {
                            continue;
                        }

                        if ((!neighboringNode.isWall && !neighboringNode.isOccupied) || gridBool[neighboringNode.gridX, neighboringNode.gridY, neighboringNode.gridZ])
                        {
                            continue;
                        }

                        int moveCost = currentNode.gCost + GetGetManhattenDistance(currentNode, targetNode);

                        bool b = neighboringNode.isInOpenList;
                        if (moveCost < neighboringNode.fCost || !b)
                        {
                            neighboringNode.gCost = moveCost;
                            neighboringNode.hCost = GetGetManhattenDistance(neighboringNode, targetNode);
                            neighboringNode.parent = currentNode;

                            if (!b)
                            {
                                openDictionary.Add(neighboringNode.id, neighboringNode);
                                neighboringNode.isInOpenList = true;
                            }
                        }
                    }
                }
            }
        }
        return null;
    }

Sorry, I should have been more clear. It’s a type of data structure into which you can insert items in logarithmic time, and it keeps them sorted for you so that the “smallest” element can be retrieved quickly. SortedSet is an example of such a data structure.

See https://en.wikipedia.org/wiki/Priority_queue.
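One possible shape for that, as a sketch only: NodeCostComparer and OpenSet are names I’ve made up, and it assumes Node’s gCost/hCost/fCost/id are ints as in the code above. The id tiebreak matters because SortedSet treats two items that compare as equal as the same element and would silently drop one.

    using System.Collections.Generic;

    // Orders nodes by fCost, then hCost, then id, giving a strict total order.
    public class NodeCostComparer : IComparer<Node>
    {
        public int Compare(Node a, Node b)
        {
            int byF = a.fCost.CompareTo(b.fCost);
            if (byF != 0) return byF;
            int byH = a.hCost.CompareTo(b.hCost);
            if (byH != 0) return byH;
            return a.id.CompareTo(b.id);
        }
    }

    public class OpenSet
    {
        // Kept sorted by the comparer, so the cheapest node is always Min.
        private readonly SortedSet<Node> nodes = new SortedSet<Node>(new NodeCostComparer());

        public int Count { get { return nodes.Count; } }

        public void Add(Node node) { nodes.Add(node); }

        public bool Contains(Node node) { return nodes.Contains(node); }

        // O(log n) retrieval of the cheapest node, replacing the linear scan.
        public Node PopCheapest()
        {
            Node cheapest = nodes.Min;
            nodes.Remove(cheapest);
            return cheapest;
        }

        // Costs must not change while a node sits in the set, or the internal
        // ordering breaks: remove, update, then re-insert.
        public void UpdateCosts(Node node, int newGCost, int newHCost)
        {
            nodes.Remove(node);
            node.gCost = newGCost;
            node.hCost = newHCost;
            nodes.Add(node);
        }
    }

With something like this, the “find the cheapest node” loop in FindPath collapses to a single openSet.PopCheapest() call.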

How much data are you searching through here? What’s the size of your grid?

Even with non-ideal choices for data structures, 300ms is quite a lot! Is this some kind of point graph that covers the level?

Hi, here is how I’m generating my grid. Currently I’m testing with a 25x15x25 grid, with a nodeRadius of .25 and a nodeDistance of .1.

If I debug the for loops I end up getting
x = 75000
y = 1500
z = 50

    void CreateGrid()
    {
        grid = new Node[gridSizeX, gridSizeY, gridSizeZ];
        Vector3 bottomLeft = transform.position - Vector3.right * gridWorldSize.x / 2 - Vector3.forward * gridWorldSize.z / 2 - Vector3.up * gridWorldSize.y / 2;
        int i = 0;
        int _z = 0;
        int _x = 0;
        int _y = 0;
        for (int z = 0; z < gridSizeZ; z++)
        {
            _z++;
            for (int y = 0; y < gridSizeY; y++)
            {
                _y++;
                for (int x = 0; x < gridSizeX; x++)
                {
                    _x++;
                    Vector3 worldPosition = bottomLeft + Vector3.right * (x * nodeDiameter + nodeRadius) + Vector3.forward * (z * nodeDiameter + nodeRadius) + Vector3.up * (y * nodeDiameter + nodeRadius);
                    bool wall = true;
                    if (Physics.CheckSphere(worldPosition, nodeRadius, wallMask))
                    {
                        wall = false;
                    }

                    grid[x, y, z] = new Node(i, wall, worldPosition, x, y, z, false);
                    i++;
                }
            }
        }
        Debug.Log(_x + " " + _y + " " + _z);
    }

If I debug a particularly nasty path in the Pathfinding script above, I may see upwards of 40,000 iterations in the loop that grabs the neighboring nodes. This is when we hit those big lag spikes.

Yeah, ooof. That grid has 75000 * 1500 * 50 = 5.625 billion entries. You might want to rethink your approach.

A NavMesh (not the Unity implementation per se, but the general technique) is able to massively speed up A* or other pathfinding algorithms in 2D by having you search through a mesh structure instead of a bunch of points. So if you have a big, nice 25x25 square of area, that’s two triangles to path through. If you have a point graph over that area, you have 625 points instead. Or 62500 if you’re sampling at 0.1 distances.

I don’t know quite what you’re using the third dimension for, but if that’s necessary, you could probably do something similar, where you have a data structure that’s able to take large, empty areas and have them be a single point instead of very many. A KD tree or some other space partitioning structure should enable you to chop quite a few zeroes off the end of how much data you’re looking through.
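
For a concrete flavour of that idea, here is a very rough octree sketch (all names are illustrative, it assumes a roughly cubic volume, and it uses Physics.CheckBox against the same wallMask as your grid code, so it is a starting point rather than a drop-in pathfinder): a cell only subdivides if it overlaps geometry, so large empty volumes stay as one big leaf.

    using UnityEngine;

    // A cell subdivides only while it overlaps something on wallMask and is
    // still larger than minCellSize. Big empty regions remain single leaves,
    // so the structure the pathfinder walks over has far fewer entries than
    // a uniform 3D grid covering the same volume.
    public class OctreeCell
    {
        public Bounds bounds;
        public OctreeCell[] children; // null for leaf cells
        public bool blocked;          // leaf overlaps an obstacle

        public OctreeCell(Bounds bounds, float minCellSize, LayerMask wallMask)
        {
            this.bounds = bounds;
            bool overlaps = Physics.CheckBox(bounds.center, bounds.extents, Quaternion.identity, wallMask);

            if (!overlaps)
            {
                blocked = false; // one big walkable leaf, no subdivision needed
            }
            else if (bounds.size.x <= minCellSize)
            {
                blocked = true;  // smallest resolution reached, treat as solid
            }
            else
            {
                children = new OctreeCell[8];
                Vector3 half = bounds.size * 0.5f;
                int i = 0;
                for (int x = 0; x < 2; x++)
                {
                    for (int y = 0; y < 2; y++)
                    {
                        for (int z = 0; z < 2; z++)
                        {
                            Vector3 center = bounds.min + new Vector3((x + 0.5f) * half.x, (y + 0.5f) * half.y, (z + 0.5f) * half.z);
                            children[i++] = new OctreeCell(new Bounds(center, half), minCellSize, wallMask);
                        }
                    }
                }
            }
        }
    }

A* would then run over the free leaf cells (connecting leaves that share a face) instead of over individual grid nodes, which is where the big reduction in searched entries comes from.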

The third dimension is because this is for flying enemies who can navigate in all directions. I wasn’t able to find anything that would do it for me, which is why I set out to make this, but honestly, if you know of a tool that handles this, I’d be fine implementing that instead.

Here is a video:
https://www.youtube.com/watch?v=aacFu0JAacE

“where you have a data structure that’s able to take large, empty areas and have them be a single point”

This would be really good, I think, but I’m not quite sure how to go about doing it. Is there anything I can read/watch to learn?