EntityManager batch API vs CommandBuffer

The other day, I was reading about how the EntityManager batch API (methods that take a query as an argument instead of a single entity) is a lot faster, because it can update whole chunks in place rather than moving each entity between chunks.

I decided to do a quick test on one of my systems that has to add multiple components to lots of entities at once, but can’t use a query. The code originally used an EntityCommandBuffer, but playback of the EntityCommandBuffer on the main thread was the slow part, so for simplicity it can be represented as the equivalent direct EntityManager calls on the main thread:

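        // Copy the entity list out of the dynamic buffer first: the structural
        // changes below invalidate the buffer handle. (BufferComponent is assumed
        // to wrap a single Entity, hence the Reinterpret<Entity>() call.)
        var buffer = EntityManager.GetBuffer<BufferComponent>(bufferEntity);
        var entityArray = buffer.Reinterpret<Entity>().ToNativeArray(Allocator.Temp);
        for (int i = 0; i < entityArray.Length; i++)
        {
            // Each call is a separate structural change on a single entity.
            EntityManager.AddComponent(entityArray[i], typeof(TagComponent));
            EntityManager.AddSharedComponentData<RenderMesh>(entityArray[i], renderMesh);
        }
        entityArray.Dispose();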

My question was whether, instead of adding both components to each entity in the loop, it would be faster to add a temporary tag to each entity and then use a query and the batch API to add the components I actually care about, like so:

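        // Query that matches only the entities carrying the temporary tag.
        var tempQuery = GetEntityQuery(
            ComponentType.ReadOnly<TempTagComponent>()
        );

        // BufferComponent is assumed to wrap a single Entity, hence Reinterpret<Entity>().
        var buffer = EntityManager.GetBuffer<BufferComponent>(bufferEntity);
        var entityArray = buffer.Reinterpret<Entity>().ToNativeArray(Allocator.Temp);
        for (int i = 0; i < entityArray.Length; i++)
        {
            // One per-entity structural change: mark the entity with the temporary tag.
            EntityManager.AddComponent(entityArray[i], typeof(TempTagComponent));
        }
        entityArray.Dispose();

        // Query-based (batch) calls: add the real components, then drop the tag.
        EntityManager.AddComponent(tempQuery, typeof(TagComponent));
        EntityManager.AddSharedComponentData<RenderMesh>(tempQuery, renderMesh);
        EntityManager.RemoveComponent(tempQuery, typeof(TempTagComponent));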

In some (admittedly unscientific) tests, the original approach would take about 47ms of main thread time to operate on ~1650 entities.

To my shock, the revised approach takes only 11.5ms main thread time to operate on 1900 entities!
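
Because the two runs above covered different entity counts, normalising to time per entity makes the comparison a little fairer. Here is a minimal sketch of a timing helper built on the standard System.Diagnostics.Stopwatch. The class and method names are made up for illustration and are not part of the Entities API:

```csharp
using System;
using System.Diagnostics;

public static class MainThreadTiming
{
    // Runs the action once and returns the elapsed wall-clock time in milliseconds.
    public static double MeasureMs(Action action)
    {
        var stopwatch = Stopwatch.StartNew();
        action();
        stopwatch.Stop();
        return stopwatch.Elapsed.TotalMilliseconds;
    }

    // Converts a total in milliseconds to microseconds per entity, so that
    // runs over different entity counts can be compared directly.
    public static double MicrosecondsPerEntity(double totalMs, int entityCount)
    {
        return totalMs * 1000.0 / entityCount;
    }
}
```

Plugging in the figures quoted above gives roughly 28.5 µs per entity for the loop and roughly 6.1 µs per entity for the temp-tag approach. A single Stopwatch run is still noisy, so the Unity Profiler remains the better tool for real measurements.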

My theory here is that the loop approach is actually way worse than it looks: each entity gets moved between chunks twice (once for each component added), because command buffer playback can’t reason about “where the entity will end up” once the whole buffer has been replayed.

By contrast, the second approach only results in one chunk move per entity despite having two extra operations per entity.

In light of all this, I have a few questions:

  • First, does this seem right? Or am I seeing some weird behaviour?
  • EntityManager has AddComponent and RemoveComponent methods that accept NativeArray - do these benefit from the same gains as the query method, even though they’re not operating on chunks?
  • If so, is there a plan to add an AddSharedComponentData overload that accepts a NativeArray?
  • If not, is there a more intuitive workflow to do this kind of operation efficiently? Or are there planned improvements to EntityCommandBuffer to allow it to automatically reason about these kinds of optimisations? (I’m sure that’s not as simple as it sounds!)

EDIT: It seems to be about 2ms faster again when using the available NativeArray API to add the TempTagComponent:

        var tempQuery = GetEntityQuery(
            ComponentType.ReadOnly<TempTagComponent>()
        );

        var buffer = EntityManager.GetBuffer<BufferComponent>(bufferEntity);
        var entityArray = buffer.Reinterpret<Entity>().ToNativeArray(Allocator.Temp);
        EntityManager.AddComponent(entityArray, typeof(TempTagComponent));
        entityArray.Dispose();

        EntityManager.AddComponent(tempQuery, typeof(TagComponent));
        EntityManager.AddSharedComponentData<RenderMesh>(tempQuery, renderMesh);
        EntityManager.RemoveComponent(tempQuery, typeof(TempTagComponent));
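
The temp-tag pattern above could be wrapped in a small helper so it doesn't have to be repeated at every call site. This is only a sketch under a few assumptions: the class and method names are hypothetical, TempTagComponent is declared here just to keep the snippet self-contained, and it uses only the EntityManager overloads already shown in this post (later Entities releases may name some of them differently):

```csharp
using Unity.Collections;
using Unity.Entities;

// Zero-size tag used only to gather the target entities into a query.
// Declared here for self-containment; use your own temporary tag type.
public struct TempTagComponent : IComponentData { }

// Hypothetical helper, not part of the Entities package.
public static class TempTagBatching
{
    // Adds TTag and a shared component value to every entity in 'entities'.
    // 'tempQuery' is assumed to match exactly the entities that carry
    // TempTagComponent; the caller still owns and disposes 'entities'.
    public static void AddTagAndShared<TTag, TShared>(
        EntityManager entityManager,
        NativeArray<Entity> entities,
        EntityQuery tempQuery,
        TShared sharedValue)
        where TTag : struct, IComponentData
        where TShared : struct, ISharedComponentData
    {
        // Mark the target entities with the temporary tag.
        entityManager.AddComponent(entities, ComponentType.ReadWrite<TempTagComponent>());

        // Query-based calls: add the components that are actually wanted.
        entityManager.AddComponent(tempQuery, ComponentType.ReadWrite<TTag>());
        entityManager.AddSharedComponentData(tempQuery, sharedValue);

        // Remove the temporary tag so the next call starts from a clean slate.
        entityManager.RemoveComponent(tempQuery, ComponentType.ReadWrite<TempTagComponent>());
    }
}
```

A call would look something like TempTagBatching.AddTagAndShared<TagComponent, RenderMesh>(EntityManager, entityArray, tempQuery, renderMesh). Note that any entity already carrying TempTagComponent for some other reason would be swept up by the query as well, so the tag type should be private to this operation.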

This is a very interesting discovery. I’m really hoping we eventually don’t pay a penalty for using command buffers. I know they will eventually be Burst-compatible, but it’s this kind of batching that I’m quite curious about Unity’s plans for.

I considered using native arrays to avoid command buffers at one point, but I’m hesitant to work around issues Unity has plans to fix anyway.


Oh, man… Being the kind of person who frequently uses Command Buffers (and the Entity Manager in general) – learning that there are potential hidden downsides to them worries me about some of the code I’ve written. :c

This is seriously going to make me rethink using them at all, honestly (at least in their current form). It might be high time that I made an add-component querying system so that I don’t have to deal with these pitfalls (especially considering EntityCommandBuffer’s notorious lack of an easy way to determine whether you’ve already added a component to an entity)… I can’t sacrifice my precious performance! Not like this! :hushed:

Does the Unity team have any plans to address these kinds of issues in the future? I just want to know because I don’t fancy creating a system that will simply be outmoded by Unity’s official implementation when it comes out (if it comes out…).

Yes, we are working on making EntityCommandBuffer significantly faster.


Is there any update on this front?


Bump. I would also love to know. :slight_smile: It’s been a while since @Joachim_Ante_1’s post. Have the optimizations he mentioned already gone in, or could there be more planned?

(I’m assuming Joachim meant something different from Burstable playback).

Playback of EntityCommandBuffers is now Burst-compiled. See the release notes. I’m not quite sure which build this landed in.


Looks like it was in 0.9.0. That’s such great news, thanks to the team.

I’m now staring at my profiler, wondering what else I can do to speed up my ECB use. It’s performing quite well (scripts run at ~5ms/f, for 200,000 entities in my code) but my ECB playback on the main thread takes longer than any of my jobs. So I’m looking into how I might optimize my code for it.

Are there any more internal optimizations planned for ECBs in the near future? Useful to know what’s coming, before I rework my code. :slight_smile:

Thank you guys, again!


Hey, sorry to bring up an old thread, but I’ve also run into roadblocks with ECB use. I did find some info on the payloads of commands, but it’s hard to work out what the costs are. This happens while chunks load and unload in my voxel game. I’ve optimized the rest but have been left with this. I’ve mainly been trying to minimize use of the EntityCommandBuffer, but there’s a lot of data that must be set.
