Is there a point to using burst compiler for a single job?

I’m just dipping my toes into DOTS so apologies if I’m completely out to sea here.

I’ve implemented an algorithm called priority flood (https://rbarnes.org/sci/2014_depressions.pdf) that fills depressions in a generated terrain to ensure that every point drains to the sea.

It runs in 4 seconds on a 512x512 grid and generates a ton of garbage. I’m sure this code could be optimized and I’m not sure if the priority queue I’m using is working that well, but I figured if I could implement it in DOTS it might help and serve as a learning experience.

My question is: since this algorithm starts at the coastline and fills inland, meaning a point cannot be marked as a depression until the entire frontier has been evaluated, I can’t see a way to break it up into multiple jobs. Does the “performance by default” approach to writing code also speed up an expensive algorithm that only runs a single time? Or does it only provide a benefit if you’re running an algorithm hundreds or thousands of times?

Non-DOTS code below:

using System.Collections.Generic;

// Single-threaded, managed (non-DOTS) implementation of Priority-Flood.
// SimplePriorityQueue comes from a third-party priority queue package.
public class PriorityFloodSingleThread
{
    int[] edge;
    int[] dem;
    Dictionary<int, bool> closed;
    Dictionary<int, float> heights;


    SimplePriorityQueue<int> open = new SimplePriorityQueue<int>();
    Queue<int> pit = new Queue<int>();
    World.MapGen map;

    public PriorityFloodSingleThread(int[] dem, int[] edge) {
        this.edge = edge;
        this.dem = dem;
        map = World.MapGen.Instance;

        closed = new Dictionary<int, bool>(dem.Length);
        heights = new Dictionary<int, float>(dem.Length);

        // Mark every cell as unvisited and cache its height.
        foreach (var index in dem) {
            closed[index] = false;
            heights[index] = map.Height(index);
        }
    }

    public Dictionary<int,float> Flood() {

        // Seed the open priority queue with the coastline cells.
        foreach (var c in edge) {
            open.Enqueue(c, map.Height(c));
            closed[c] = true;
        }

        float pitTop = -1f;

        // Pull the next cell to process from the pit queue or the open priority queue.
        while (open.Count > 0 || pit.Count > 0) {
            int c;
           
            if((pit.Count > 0 && open.Count > 0) && pit.Peek() == open.First) {
                pit.Dequeue();
                c = open.Dequeue();
                pitTop = -1f;
            }
            else if(pit.Count > 0) {
                c = pit.Dequeue();
                if(pitTop == -1f) {
                    pitTop = map.Height(c);
                }
            }
            else {
                c = open.Dequeue();
                pitTop = -1f;
            }

            // Visit each unclosed neighbour: raise it if it sits at or below
            // the current cell, otherwise push it onto the open queue.
            foreach (var n in map.GetNeighbors(c)) {
                if (map.waterMap[n]) {
                    closed[n] = true;
                }

                if (closed[n]) continue;
                closed[n] = true;

                if (heights[n] <= heights[c]) {
                    heights[n] = heights[c] + .0001f;
                    pit.Enqueue(n);
                }
                else {
                    open.Enqueue(n, heights[n]);
                }
            }
        }

        return heights;
    }
}

Yes, using DOTS will improve the performance of your code significantly even if you don’t use multiple cores (IJobParallelFor). Burst will optimize your code and generate better machine-code instructions, and the Job System will run the code on a worker thread (even if it’s single-threaded, your code doesn’t need to block the main thread).
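
Roughly, the setup looks something like this (a minimal sketch: FloodJob, FloodRunner and the field names are placeholders I made up, and the Execute body is just a stand-in for your actual algorithm):

using Unity.Burst;
using Unity.Collections;
using Unity.Jobs;
using UnityEngine;

// Burst-compiled job that still runs single-threaded, just not on the main thread.
[BurstCompile]
struct FloodJob : IJob
{
    [ReadOnly] public NativeArray<float> inputHeights; // flattened 512x512 grid
    public NativeArray<float> outputHeights;           // filled result

    public void Execute()
    {
        // The whole algorithm would run here; this copy is only a stand-in.
        for (int i = 0; i < inputHeights.Length; i++)
            outputHeights[i] = inputHeights[i];
    }
}

public class FloodRunner : MonoBehaviour
{
    void Start()
    {
        var input = new NativeArray<float>(512 * 512, Allocator.TempJob);
        var output = new NativeArray<float>(512 * 512, Allocator.TempJob);

        JobHandle handle = new FloodJob { inputHeights = input, outputHeights = output }.Schedule();

        // The main thread is free to do other work here.
        handle.Complete(); // block only when the result is actually needed

        input.Dispose();
        output.Dispose();
    }
}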


To get the best out of Burst, you need to convert managed data types to unmanaged ones.
For example, in your case the Dictionary could turn into a NativeHashMap.
Classes get converted to structs.
And generally, it means switching your mind from the OOP paradigm to DOD.
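
As a rough sketch of that mapping (PriorityFloodJob is just a placeholder name; the field names mirror the ones in your class, and there is no built-in native priority queue, so the open queue would need a custom unmanaged implementation):

using Unity.Burst;
using Unity.Collections;
using Unity.Jobs;

[BurstCompile]
struct PriorityFloodJob : IJob
{
    [ReadOnly] public NativeArray<int> edge;      // was int[] edge
    [ReadOnly] public NativeArray<int> dem;       // was int[] dem
    public NativeHashMap<int, bool> closed;       // was Dictionary<int, bool>
    public NativeHashMap<int, float> heights;     // was Dictionary<int, float>
    public NativeQueue<int> pit;                  // was Queue<int>
    // SimplePriorityQueue<int> has no native equivalent; 'open' would need
    // a custom unmanaged priority queue (e.g. a binary heap over a NativeList).

    public void Execute()
    {
        // Port of Flood() goes here, using only the unmanaged fields above.
    }
}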

Burst can give you even a 10x performance gain, if the code is Burst-compatible.

Flood algorithms may be difficult to multithread, depending on the logic. But Bursting it will definitely give you a massive gain.

Also, as mentioned above, you could schedule a single-threaded job, which would allow your calculation to be computed off the main thread, meaning on another thread, while still being single-threaded.


You can run a single-threaded job for smaller work items (it’s quicker to set up) and multi-threaded jobs for longer-running work.

As for linking jobs together across threads rather than completing each one as soon as it runs (and maybe wasting precious time), you can pass the handle as a dependency when scheduling the next job. Then use Complete at the end to await completion.
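
For example (ProduceJob and ConsumeJob are just made-up placeholder jobs to show the handle being passed as a dependency):

using Unity.Burst;
using Unity.Collections;
using Unity.Jobs;
using UnityEngine;

// Two placeholder jobs where the second one reads what the first one wrote.
[BurstCompile]
struct ProduceJob : IJob
{
    public NativeArray<float> data;
    public void Execute() { data[0] = 42f; }
}

[BurstCompile]
struct ConsumeJob : IJob
{
    public NativeArray<float> data;
    public void Execute() { data[0] += 1f; }
}

public class DependencyExample : MonoBehaviour
{
    void Start()
    {
        var data = new NativeArray<float>(1, Allocator.TempJob);

        JobHandle first = new ProduceJob { data = data }.Schedule();
        // Pass the first handle as a dependency so ConsumeJob only starts
        // once ProduceJob has finished writing.
        JobHandle second = new ConsumeJob { data = data }.Schedule(first);

        second.Complete(); // await the whole chain only when the result is needed
        Debug.Log(data[0]);
        data.Dispose();
    }
}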

It’s definitely blow-your-mind fast (as if you were using C++ code).

List of types here, including NativeQueue:

Namespace Unity.Collections | Collections | 1.3.1
