[June 2024] Full ECS Stack Review

Hi everyone,

If you aren’t familiar with this format, every so often I make a big post discussing my thoughts on the Unity ECS ecosystem and all of its parts. This feedback is specifically addressed to the folks at Unity.

My last review was in February. I have decided to make these reviews more frequent and a little smaller in scope. For this review, I’ll be examining the current state of Entities 1.2.3 and 1.3.0-exp.1.

Also like last time, I will limit myself to one positive and one negative of each feature. But in addition, I will also pose a question. The questions are not rhetorical. I truly do not know the answers to these things and would appreciate answers.

If you are looking for a more exhaustive list of critical feedback, I have updated my wishlist here.

General Feeling

Last time, I complained that Entities 1.X feels abandoned. I don’t feel that way anymore. That’s a victory!

However, that feeling is replaced by the feeling that the DOTS teams are severely understaffed and overloaded by process overhead, because some areas are seeing extremely slow progress, even for very low-hanging fruit.

Last time, I also called out the trend of more and more people relying on internal access and forking the packages. I still believe this to be a big issue, but I would like to bring attention to something a little different.

DOTS has a steep learning curve. People still complain about resources being out of date and things always changing, even though the API has been “stable” for over a year and a half. The problem is that even though the packages have not seen a lot of changes, the recommended patterns have still changed a lot, and are continuing to do so.

While it is true that most people learn best from patterns, the most stable patterns are built on stable rules. And in some areas, the rules are not well-defined right now. Here’s an example of two ways to do the same thing, yet which is better depends on what the rule is, a rule I truly do not know right now:

[BurstCompile]
partial struct JobA : IJobEntity
{
    public EntityCommandBuffer.ParallelWriter ecb;

    public void Execute(Entity entity, [ChunkIndexInQuery] int chunkIndexInQuery, Health health)
    {
        if (health.health <= 0f)
        {
            // Sort key: the chunk's index within the query.
            ecb.DestroyEntity(chunkIndexInQuery, entity);
        }
    }
}

[BurstCompile]
partial struct JobB : IJobEntity
{
    public EntityCommandBuffer.ParallelWriter ecb;

    [NativeSetThreadIndex]
    public int threadIndex;

    public void Execute(Entity entity, Health health)
    {
        if (health.health <= 0f)
        {
            // Sort key: the index of the thread running this job, injected via [NativeSetThreadIndex].
            ecb.DestroyEntity(threadIndex, entity);
        }
    }
}

Here’s another challenge: what is the most correct way to implement the following method?

public static void AddComponentValues<T>(NativeParallelHashMap<Entity, T> hashmapWrittenToByParallelWriter, EntityManager entityManager)
    where T : unmanaged, IComponentData
{
    // ... How to write this correctly?
}

Both of these challenges are actually touching on the same technical detail, and boil down to the same question in my mind. I felt I used to know the answers to these, back when the ENTITY_STORE_V1 was the only implementation. But now, I remain confused about this, and I also wonder how many other assumptions I’ve falsely made about the rules. I have seen multiple Unity staff complain about how users will quickly come to rely on unintended behavior. But without clearly-defined rules around these things (and not just the rule of “follow this pattern”), I don’t think you are without fault. You are providing a product to people who solve problems any way they know how. If you can’t define clear rules within which they can solve their problem, they are going to explore waters you don’t want them to explore.

This is why I wanted to throw questions into this format, to help facilitate the discussion.

My Biggest Pain Points (And a Positive Compared to Last Time)

Currently, I am facing two engine bugs. One is an editor soft-lock related to audio and baking. The other is a scaling issue due to a bad resource type and component stride being bound to shaders using bone weights.

In my last review, I complained about the breaking of entity ID determinism. I’m still very concerned about this from a rules and debugging standpoint and have some unanswered questions, but I can acknowledge the desire to move away from this if it is a huge deal for GameObject/ECS unification.

I was very nervous about upgrading to newer entities using ENTITY_STORE_V1, because I believed it wasn’t going to be officially supported, and that I would encounter a bug that snuck past a lack of internal testing. To my surprise, ENTITY_STORE_V1 is stable, even on Entities 1.3.0-exp.1. Thank you!

I don’t mind you taking your time to define the rules as long as you keep this around until you do. Similarly, I don’t mind if you take your time to figure out an alternative means of source generator mechanisms for doing component aggregation abstraction, as long as you keep IAspect around until you have a solution. My fear is that you will remove these things without any kind of replacement I can work with. I don’t mind if it becomes more work for me. I just need to be able to provide users an easy API and good performance.

Anyways, time to dive into the features!

Mathematics

The Good: Still my favorite math library.

The Bad: This package is missing a lot of types and utilities for booleans and small integers (byte, short, etc.).

Question: How does Burst detect the natively-understood types? If I were to modify the package, which modifications will and won’t break Burst?

Burst

The Good: You guys are still the lifeblood of DOTS. I would not be here without you. And I appreciate you prioritizing fixes!

The Bad: I am currently facing bad-image exceptions due to default interface methods. It is inconsistent though. Sometimes Burst compiles them fine, and sometimes it fails to create a build because of it.

Question: Is there any way to determine which UnityEngine methods or properties can be called from within Burst (on the main thread) without just trying it out and seeing if Burst successfully compiles?

Collections

The Good: I don’t know if I said this before, but all the safety checks you have even in the unsafe containers are amazing, and have accelerated my ability to debug countless issues due to the stack traces those checks produce.

The Bad: My complaint is the same as last time. It might be more of a job thing, but I really want to be able to have a dynamic number of containers inside some jobs. Not like thousands, mind you. But a good example of this would be an array of DynamicComponentTypeHandles.

Question: Is it possible to create a custom allocator that works with AllocatorManager and can rewind safety handles inside of a job?

Jobs

The Good: I’ve seen even more effort at making jobs faster, even in the LTS!

The Bad: While jobs are fast on worker threads, scheduling them from the main thread still has significant overhead. In addition, the main thread always needs to know the number of jobs to schedule in advance, which in some situations can result in sync points. The sync points in Unity Physics are an example of this.

Question: Is there a way to augment the string associated with the job in the profiler rather than rely on child profiler samples?

Entities Baking

The Good: In Entities 1.2.3, I get warnings about memory leaks in import workers, and I know they are my fault. This reporting is working correctly, which used to not be the case.

The Bad: It is nearly impossible to correctly bake a component on an entity dependent on whether or not it is referenced in a list of another authoring component. I’ve only figured out how to do this when the authoring component with the list is an ancestor.

Question: What are the rules for hashes used for BlobAssetStore?

Entities Editor

The Good: I see bugfixes related to this in pretty much every release. I appreciate how much care you put into this to make this good.

The Bad: Inspectors and windows are not customizable.

Question: Is there a way to see the index of an entity within its chunk?

Entities Runtime

The Good: I used to complain about IEnableableComponent and queries being a mess and breaking in a bunch of places. While there are still a few edge-cases, it has gotten a lot better!

The Bad: Source generators are a real problem in terms of extensibility and feature sets. The tight coupling of IJobEntity and the system that schedules it, the whole system rewriting mechanism and inability to aggregate type handles, the inability to access IJobEntity’s query, and the fact that I can’t use the IJobEntity’s underlying IJobChunk to work with cached chunk data from an earlier pass have all been major annoyances. I have an example of a gap I found in this API for something I believe should be simple to do. Given this singleton and job, can you write the system that schedules this job without creating a sync point on AnEnableableComponent?

struct FilteredSumListSingleton : IComponentData
{
    public NativeList<int> sums;
}

[BurstCompile]
[WithAll(typeof(AnEnableableComponent))]
partial struct FilteredSumToListJob : IJobEntity
{
    [NativeDisableParallelForRestriction] public NativeArray<int> sums;

    public void Execute([EntityIndexInQuery] int i, in ComponentA a, in ComponentB b) => sums[i] = a.a + b.b;
}

Question: Is ENTITY_STORE_V1 planned to stay for the remainder of Entities 1.X?

Scenes

The Good: Adding UnityObjectRef serialization support is a big deal, and can really clean up a lot of problems with prefabs.

The Bad: More of a baking issue, but in the latest Entities 1.3.0-exp.1, there was a change to make the internal capacity of LinkedEntityGroup be 0 instead of 1, however standalone prefab entities still receive LinkedEntityGroup, resulting in heap allocations when instantiated. I had to fix this in a baking system.

Question: Is there a way to control (wait for some other signal) the activation of a subscene (merging it into the target world) after it has loaded?

Transforms

The Good: The Child buffer is no longer a chunk space hog!

The Bad: There is a race condition on change filters when updating the hierarchy that results in some subtrees having their matrices updated unnecessarily. This is at least a performance issue, if not a functional issue (though at the very least it won’t result in crashes or invalid version numbers).

Question: I’ve heard there’s a new unified transform system coming. Will this new system support writing to the transforms (including transforms that might be children) in a parallel IJobEntity job?

Graphics

The Good: I appreciate bugfixes being proposed on the forums landing into releases, even when there doesn’t seem to be any other developments happening.

The Bad: Since last time, the only improvements I’ve seen to the package were a few bugfixes and the baking overhaul, the latter being credited to a DOTS generalist. In the meantime, I ended up pulling the trigger and rewriting LODs to be faster, lighter, and support LOD crossfades. And most recently, I have developed an even more optimal LOD algorithm that allows packing the entire LOD Group into a single entity. I’d love for some of this tech to go into the official package and want to have conversations, but so far no one has reached out to me about these things.

Question: While implementing the packed LOD Group algorithm, I noticed buried deep in the Entities Graphics code you mask out the upper 8 bits of the BatchMeshID but not the BatchMaterialID. What are the rules for these IDs that allow you to do this masking at all?
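For readers unfamiliar with the trick in question, here is a minimal sketch of what masking out the upper 8 bits of a 32-bit ID implies. This is not the Entities Graphics code: `PackedIdSketch`, `Pack`, `ExtractId`, and `ExtractPayload` are hypothetical names, and the sketch works on a raw `uint` rather than on the actual ID structs. It only shows that the mask is lossless when the original ID fits in the lower 24 bits, which is exactly the rule being asked about.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class PackedIdSketch
{
    const int PayloadShift = 24;          // the upper 8 bits start at bit 24
    const uint IdMask = 0x00FFFFFFu;      // keeps the lower 24 bits

    // Stores an 8-bit payload in the upper bits of a 32-bit value.
    // Assumption: the ID never uses its upper 8 bits; otherwise this throws.
    public static uint Pack(uint id, byte payload)
    {
        if ((id & ~IdMask) != 0)
            throw new ArgumentOutOfRangeException(nameof(id), "id must fit in 24 bits");
        return id | ((uint)payload << PayloadShift);
    }

    // Masks out the upper 8 bits to recover the original ID.
    public static uint ExtractId(uint packed) => packed & IdMask;

    // Recovers the 8-bit payload from the upper bits.
    public static byte ExtractPayload(uint packed) => (byte)(packed >> PayloadShift);
}
```

If an ID could ever exceed 24 bits, `ExtractId` would silently return a different ID, which is why the masking is only safe under a documented guarantee about the ID range.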

Physics

The Good: I’ve really come to appreciate your constraint solver, especially with how you are able to handle extremely stiff springs. And I also want to call out that this package saw a good amount of development since last time, enough that I would dare consider it a healthy pace.

The Bad: Last time I complained about architecture, and while I still think that is true, I want to call attention to something a little more specific I’ve observed since last time. The Physics package is extremely wasteful of ECS chunk memory. There are multiple dynamic buffers of random capacities, bools that aren’t bit-packed, enums that use 4 bytes, optional values, and suboptimal field orderings. The acceleration structures do a much better job of packing this data down, but I really think with some C# properties you could reduce your ECS footprint while still retaining an accessible API.
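To illustrate the kind of change meant by "some C# properties", here is a minimal sketch. None of these types come from Unity Physics: `MotionKindInt`, `MotionKind`, `BodySettingsNaive`, and `BodySettingsPacked` are hypothetical names invented for this example. It assumes default sequential struct layout with natural alignment, under which the naive struct should occupy 12 bytes and the packed one 8.

```csharp
using System;

// Hypothetical enums: the default backing type is int (4 bytes); ": byte" uses 1 byte.
public enum MotionKindInt { Static, Kinematic, Dynamic }
public enum MotionKind : byte { Static, Kinematic, Dynamic }

// Naive layout: each bool takes a whole byte, the enum takes 4 bytes,
// and padding is inserted before the 4-byte-aligned fields.
public struct BodySettingsNaive
{
    public bool isTrigger;
    public bool raisesEvents;
    public MotionKindInt motion;
    public float friction;
}

// Packed layout: largest field first, both bools share one byte,
// and the enum is byte-backed. Properties keep the API readable.
public struct BodySettingsPacked
{
    public float friction;
    public MotionKind motion;
    byte flags;

    const byte IsTriggerBit = 1 << 0;
    const byte RaisesEventsBit = 1 << 1;

    public bool IsTrigger
    {
        get => (flags & IsTriggerBit) != 0;
        set => flags = (byte)(value ? flags | IsTriggerBit : flags & ~IsTriggerBit);
    }

    public bool RaisesEvents
    {
        get => (flags & RaisesEventsBit) != 0;
        set => flags = (byte)(value ? flags | RaisesEventsBit : flags & ~RaisesEventsBit);
    }
}
```

Calling code still reads and writes `settings.IsTrigger` like a plain field, so the saving comes without changing how the component is used.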

Question: Based on this article: Solver2D :: Box2D

I have been trying to figure out how to describe what Unity Physics is using. Is the following description wrong in any way?

Character Controller

The Good: I finally found time to dig into the code and see how it all works. It is quite impressive how many cases you cover that make it as general-purpose as it is for something that is extremely subjective!

The Bad: This package doesn’t seem to be synced yet with the rest of the packages.

Question: Do you have any thoughts on how one might modify this controller to do a spring-based hover over the ground?

NetCode

The Good: This is anecdotal, but I have heard some praise about how with NetCode, if you get it working on a local device, there’s a very high chance of it working in a real networked situation (correctness-wise). I also want to call out that like Unity Physics, this is another package that has seen a good pace of development recently.

The Bad: I still think the workflows are too restrictive and overbearing. I’d like to see more things be natural (like transform hierarchies and ECS structural changes) and I would like to also see more things be overridable around synchronization.

Question: What optimization tips would you suggest for a game that has thousands of non-player entities that move based on player movement (targeting the player)? Dead-reckoning isn’t really an option here, and I don’t think you’d want these entities to be predicted. Is there anything else available?

DSPGraph

The Good: This package still works.

The Bad: I’m getting soft-locks in the editor upon domain reload when doing audio things. I’m still trying to track down if this involves DSP Graph or is exclusive to calling AudioClip.GetData() during baking (which no longer crashes, but still seems to be a requirement for this other issue). Using a sampling profiler, the soft-lock seems to be deep in C++ engine code related to audio.

Question: Is there a different API I should be using to read audio clip assets during baking?

Final Thoughts

User trust is still a concern right now, so you’ll have to forgive me if I am still extremely skeptical and fearing the worst. But I definitely have a little bit more hope than I did in my last review. I think there’s still a major uphill battle ahead before a unified GameObjects/ECS is ready for the masses.

Please feel free to reach out to me via PM in these forums, Discord, or publicly in this thread. I want to discuss, learn, understand, and maybe even teach. I hope you find these posts helpful, and that the questions I bring will stir up some meaningful internal conversations.

For everyone who did decide to read this wall of text (Unity or not), thanks for making it all the way to the bottom. If you have comments or questions, feel free to reply to this thread and share them!

Bit of a vendetta of mine. I forked physics over a year ago and removed all sync points, and this change doubles the performance for me (*your results may vary).

Not only that, but they are forced to be dynamic transforms, which kind of nullifies the whole transform flag stuff. That is something else I changed in a fork.

This isn’t a big deal for me personally, since I have a smaller root transform footprint and don’t typically use non-transform prefab entities. But I did look at the changes in your fork, and I am curious. How does your change impact prefabs that only have a single GameObject but have a baker that calls CreateAdditionalEntity()?

Good observation, it didn’t. Not sure how I never ran into this issue but I don’t use CreateAdditionalEntity that often.

However, I think I’ve fixed this case now as well (though I need to test more and check any other case I may have overlooked.)

Looking forward to another review from you @DreamingImLatios soon! Your content here has been extremely useful for those of us reviewing whether to move to uECS.

My next review will be shortly after Entities 1.4 preview is released. If that release is uneventful, I’ll mostly be talking about the engine, because that’s the only area that’s really gotten any big innovation (and a lot of it was unannounced, so I actually missed most of it during my last review). And all of the next-gen stuff is still speculative from my perspective, which aside from having made it way easier to prioritize my own work, isn’t that useful from a feedback perspective. (I still appreciate what was shared though.)

In the meantime, I really hope some staff at Unity eventually get around to answering at least some of my questions. I don’t expect any individual to know the answers to all of them. But having answers would definitely help me be more accurate in my next review.

Well, this post is definitely an outstanding piece of work!

Thank you very much for it!

To be honest, if I had seen it two weeks ago, I definitely would not have spent these last two weeks studying ECS (I am interested in it exclusively in the context of Netcode for Entities). Judging by the questions and problems that you raised in this post, at the moment ECS is not a system for the end game developer.

ECS from Unity is some kind of very crude and buggy billet for framework developers, who in turn, with titanic efforts, have a chance to make some kind of working ECS-based system.
