Designing Coroutines for Performance and Expediency

Hi everyone.

Since the Unity API is not thread-safe (I believe that’s because most things in Unity have a managed-code component and a native-code component), I’ve been thinking about coroutines.

Now, one of the big problems with coroutines is balancing them. If you yield too often, the coroutine will take forever to complete (e.g. a million calculations, which can be done almost instantly in a regular method, will take four and a half hours(!) if you yield after every calculation at 60FPS). On the other hand, if you yield too infrequently, the coroutine damages your performance, as it holds up moving to the next frame.

Since, in my heart of hearts, I really just want threads 😛, I’ve been playing with some code that allocates a time-slice to a coroutine. The idea is we figure out how long the frame takes (or rather, assume this frame will take about as long as the last frame and use Time.deltaTime), and then allow the Coroutine to take some specified portion of that (e.g. 10%).

And in this code it pretty much works, because although I set the startTime in Update(), Unity’s order of execution is to resume coroutines immediately after Update(). However, in this example I have only one Update(), and only one coroutine. With more Updates and more coroutines, there may end up being some time between setting startTime and resuming the coroutine, which would cause the code to not work as intended.

using System.Collections;
using UnityEngine;

public class CoroutineTest : MonoBehaviour {
   
    public float timeSlice = 0.1f;    // How much of a frame do we let the Coroutine have?
    public float targetFramerate = 60f;    // What is our target framerate?
   
    private float startTime;    // When was Update called?
    private float allowedTime;    // How much time are we letting our Coroutine run?
    private float targetTime;    // How long does one frame at target framerate last?
   
    void Start() {
        print ("Start");
        /* We need time differences WITHIN a frame so realtimeSinceStartup is our only
         * option in Time's members. Per documentation: May not work well for all
         * platforms(?) */
        startTime = Time.realtimeSinceStartup;
        allowedTime = timeSlice / targetFramerate;
        targetTime = 1 / targetFramerate;
        StartCoroutine (MyCoroutine ());
    }
   
    void Update() {
        print("New frame: " + Time.frameCount);
        /* This bit takes some explanation. We want the Coroutine to take more of the
         * frame if our framerate is above our target framerate (i.e. we have time to
         * spare), and less of the frame if our framerate is below our target framerate
         * (i.e. performance is at a premium). Scaling frameTime * timeSlice by
         * (targetTime / frameTime) does that, but frameTime cancels out, so the budget
         * is simply timeSlice * targetTime seconds: the same value computed in Start(),
         * recomputed here in case the public fields change at runtime. Writing it this
         * way also avoids dividing by zero when Time.deltaTime is 0 (e.g. while the
         * game is paused with Time.timeScale = 0). */
        allowedTime = timeSlice * targetTime;

        // What time is it now, just before we resume our Coroutine?
        startTime = Time.realtimeSinceStartup;
    }
   
    IEnumerator MyCoroutine() {
        int passes = 0;
        int target = 1000000;

        while(passes < target) {
            /* Do processing here. I've chosen Mathf.Log since it's a simple yet expensive
             * call that, under normal circumstances, you don't want to do too often. Just
             * for demonstration ;) */
            float result = Mathf.Log(passes, 2);
            print("Log base 2 of " + passes + " = " + result);
            passes++;

            // yield if the Coroutine has taken longer than it's allowed
            if(Time.realtimeSinceStartup - startTime > allowedTime) {
                yield return null;
            }
        }
    }
}

So, does anyone have any ideas for improving this?

What benefit do you think you’re getting from the coroutine? You’re using the Update method anyway, why not just do the work in there? What is the problem you’re solving, and why do you think that coroutines in particular are the solution?

Splitting work into time chunks is a solution for one thing. Coroutines are a solution to another. Sometimes they overlap, but that is not their purpose.

Coroutines aren’t a replacement for threads and shouldn’t be treated as such. If you’ve got enough data processing to do that threads are beneficial, then design your data structures so that you can actually use threads, i.e. get the bulk data operations out of your scene.

That aside… When I did something like this in the past I had the coroutine manage itself. This was a while ago and not a hack I intended to repeat, so I don’t remember it in great detail. Take note straight away that Unity’s Time class probably isn’t a good candidate for this, though, as it’s really not designed to manage sub-frame time intervals. Check this out in the docs:

I may have used System.Diagnostics.Stopwatch in my hack. Whatever it is, you need a high resolution timer. Each iteration I’d have reset the stopwatch (or whatever my timer was) and started doing work. I’d periodically check if the time was expired. If it was I’d yield, then reset the timer upon returning. I could, if I had wanted, have done some analysis and modified the time budget, but I’d be really questioning my design if I got in a situation where that was necessary - that strikes me as a low level mitigation of a high level problem (ie: if I don’t know enough to pick a time budget then I also don’t know how long it’s going to take, and that’s probably a problem more worth my time to solve).
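For anyone who wants to see the shape of that, here is a minimal sketch of the "coroutine manages itself" approach described above. It is not the original hack, just my reading of it. `TimeSliced`, `Run`, `totalSteps`, `budgetMs` and `step` are names made up for illustration, and it is written as a plain C# iterator so it compiles without Unity.

```csharp
using System;
using System.Collections;
using System.Diagnostics;

// Hypothetical helper, not part of Unity: runs `totalSteps` units of work and
// yields whenever the current slice has used up its time budget.
public static class TimeSliced
{
    public static IEnumerator Run(int totalSteps, double budgetMs, Action<int> step)
    {
        // Stopwatch uses a high-resolution counter where the platform provides one.
        Stopwatch timer = Stopwatch.StartNew();

        for (int i = 0; i < totalSteps; i++)
        {
            step(i);

            // Has this slice used up its budget?
            if (timer.Elapsed.TotalMilliseconds >= budgetMs)
            {
                yield return null;   // in Unity this resumes on the next frame

                // The budget restarts when we are resumed, so it does not matter
                // how many Update() calls or other coroutines ran in between.
                timer.Reset();
                timer.Start();
            }
        }
    }
}
```

In a MonoBehaviour this would presumably be started with something like `StartCoroutine(TimeSliced.Run(1000000, 1.5, i => Mathf.Log(i + 1, 2)))`, where 1.5 ms is an arbitrary budget picked for the example. Because the timer is reset inside the coroutine itself, there is no shared startTime that can go stale between Update() and the resume.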

On the topic of my hack, it was only necessary because we were moving functionality that was previously an Edit-time baking process into something done at load time. If that had been the plan from the start we’d have come up with a more elegant solution in the first place, which is generally the approach I’d suggest.


This will only really be necessary if you need to talk to the Unity API in your threaded job. It’s probably a lot faster (and less headachey) to start a real thread and give it the information you need it to do the maths on. I believe that NavMeshAgent does this internally for pathfinding.

But if you must speak with the engine, then this seems like a nice enough workaround.
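A rough sketch of the "real thread" route, assuming the work only needs plain data and never touches the Unity API. `BackgroundJob` and its members are hypothetical names for this example; only `System.Threading.Thread` is real.

```csharp
using System;
using System.Threading;

// Hypothetical helper: runs a pure computation on a background thread.
// The worker only sees the data it is handed and must not call the Unity API.
public sealed class BackgroundJob<TInput, TResult>
{
    private readonly Thread thread;
    private TResult result;
    private Exception error;
    private volatile bool isDone;

    public BackgroundJob(TInput input, Func<TInput, TResult> work)
    {
        thread = new Thread(() =>
        {
            try { result = work(input); }
            catch (Exception e) { error = e; }   // keep a worker failure from killing the process
            isDone = true;                       // volatile write publishes result/error
        });
        thread.IsBackground = true;              // do not keep the process alive on exit
        thread.Start();
    }

    // Safe to poll from the main thread, e.g. once per frame.
    public bool IsDone { get { return isDone; } }

    // Non-null if the work threw an exception.
    public Exception Error { get { return error; } }

    public TResult Result
    {
        get
        {
            if (!isDone) throw new InvalidOperationException("Job has not finished yet.");
            return result;
        }
    }

    // Blocks the caller until the worker finishes; avoid this on the main thread in a game.
    public TResult Wait()
    {
        thread.Join();
        return result;
    }
}
```

On the Unity side, a coroutine could then wait with `while (!job.IsDone) yield return null;` and read `job.Result` afterwards, so only the final hand-off happens on the main thread. The input should be a copy (or something nobody else modifies) while the job is running.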

Ah, that’s embarrassing. I can see I’ve made some fundamental mistakes here. Not least of which is that I was somehow under the impression that the Unity API not being thread-safe meant that you can’t use threads at all in a Unity project. I guess it just means you can’t make calls to the API in a thread.

I have two coroutines in my main project right now. One of them is already a good candidate for a thread, because it doesn’t need to access the API at all. The other one does (painting a splatmap on a terrain and needing to access TerrainData.GetSteepness() at each point - maybe I could just write my own steepness-calculating function to break that reliance on the API).
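Thinking out loud about that last idea, a home-made steepness function could be sketched like this. It is an approximation from finite differences and I have not checked it against what TerrainData.GetSteepness() actually returns. `Steepness`, `DegreesAt`, the `heights[z, x]` layout and the cell sizes are all assumptions of the example. The heights are assumed to be in world units already; as far as I know TerrainData.GetHeights() returns normalised 0..1 values, so those would need scaling by the terrain height first.

```csharp
using System;

// Hypothetical stand-in for a steepness query that works on a plain array,
// so it can run on a worker thread without touching the Unity API.
public static class Steepness
{
    // heights[z, x]: heights in world units (assumed layout).
    // cellSizeX / cellSizeZ: world-space distance between neighbouring samples.
    // Returns the slope angle in degrees, 0 for flat ground.
    public static float DegreesAt(float[,] heights, int x, int z,
                                  float cellSizeX, float cellSizeZ)
    {
        int depth = heights.GetLength(0);
        int width = heights.GetLength(1);

        // Central differences, clamped to one-sided differences at the borders.
        int x0 = Math.Max(x - 1, 0), x1 = Math.Min(x + 1, width - 1);
        int z0 = Math.Max(z - 1, 0), z1 = Math.Min(z + 1, depth - 1);

        // Rise over run along each axis (0 if the map is only one sample wide).
        double dhdx = x1 > x0 ? (heights[z, x1] - heights[z, x0]) / ((x1 - x0) * cellSizeX) : 0.0;
        double dhdz = z1 > z0 ? (heights[z1, x] - heights[z0, x]) / ((z1 - z0) * cellSizeZ) : 0.0;

        // The length of the gradient is the tangent of the slope angle.
        double gradient = Math.Sqrt(dhdx * dhdx + dhdz * dhdz);
        return (float)(Math.Atan(gradient) * 180.0 / Math.PI);
    }
}
```

The heights would be copied out of the terrain once on the main thread, and after that the splatmap weights could be computed from the array alone.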

Thanks for the advice.


Yeah, this is something that many people seem to think. It does make threading our code more difficult, but (for reasons discussed elsewhere at length) making us break ongoing threaded work out of the scene isn’t necessarily a bad thing overall.

Don’t be embarrassed. It’s far better to ask things and learn early that you’ve headed in the wrong direction than to persist in solitude and waste far more time and effort on it.

Sounds like you’re headed the right way now. I personally do my best to design purpose-specific algorithms and data structures for my “heavy lifting” code. A part of that is considering how the computer uses data, as opposed to how objects represent concepts. So far, even for some pretty crazy stuff I’ve done, that’s been enough to get the job done without having to resort to threads.

The other thing of course is to use the Update method (and friends) as little as possible. If a script is dormant, disable it. Use events to enable things only when they’re needed. Use aggregate controllers where possible to perform calculations in bulk rather than having hundreds or thousands of scripts doing the same things individually. Of course only consider those things where you’re doing real heavy lifting. Don’t optimise something that’s only going to be a negligible part of your game cycle anyway, it’s probably a waste of time.

Reading your initial post, my thought was that this is a total waste of time for anything other than Unity’s terrain engine. Since you mention you are working with terrain, it’s not a bad idea. I use something similar, but without the timer, in my infinite procedural terrain generator.

The reason for this trickery is that some of the terrain API calls by themselves take more time than is acceptable for a frame.

As was pointed out, it’s an ugly hack. But sometimes ugly hacks are all that can be done.

If you need to do a lot of calculations really fast, and are running out of CPU time, there are Compute Shaders. I haven’t used them myself, but the idea is to do big parallel algorithms on the GPU.

They require a bunch of additional knowledge, and since it’s the GPU you should probably feed it SIMD-style work, but be aware of their existence in case that becomes useful.