I’m working on an event-based ability-system framework, which is essentially dynamic entity management:
adding, removing, and updating components on entities (Caster, Hit-Entity, Ability-Entity…).
Theoretically, the framework is very scalable and flexible.
We have planned out many abilities, like projectiles, mind control, elemental interactions…
Everything is theoretically feasible and quite easy to configure, especially for artists (non-programmers).
The only issues we are facing are all related to structural changes (performance cost and a minimum one-frame delay).
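To illustrate the frame delay, here is a rough sketch (assuming the Entities 1.x API; HitEvent, Stunned, and ApplyStunSystem are hypothetical names for illustration, not our actual framework code):

```csharp
using Unity.Entities;

// Hypothetical components, for illustration only.
public struct HitEvent : IComponentData { public Entity Target; }
public struct Stunned  : IComponentData {}

public partial struct ApplyStunSystem : ISystem
{
    public void OnUpdate(ref SystemState state)
    {
        // Adding a component is a structural change, so it has to be deferred
        // through an EntityCommandBuffer.
        var ecb = SystemAPI
            .GetSingleton<EndSimulationEntityCommandBufferSystem.Singleton>()
            .CreateCommandBuffer(state.WorldUnmanaged);

        foreach (var hit in SystemAPI.Query<RefRO<HitEvent>>())
        {
            // The Stunned tag only exists after the ECB plays back at the end
            // of the SimulationSystemGroup, so any system that already ran
            // this frame reacts one frame late.
            ecb.AddComponent<Stunned>(hit.ValueRO.Target);
        }
    }
}
```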
The new enableable components feature is THE solution we were looking for to fix the frame-delay problem, especially for networked games, which are supposed to be handled by default.
But this feature comes with its own performance issues and limitations.
Is Unity planning to overcome these limitations?
Maximum of 128 entities per chunk.
Not possible to make existing components enableable.
Having enableable components on archetypes where they are not required.
…
Bigger chunks have greater economy of scale.
AFAIK, job scheduling is already suffering in some cases due to the strict 16 KB chunk-size limit.
Adding the 128-entity limit will make the situation even worse. Imagine having hundreds of chunks, each containing only a few entities with enabled component data, and each job running on its own separate chunk.
Tag components are zero-sized, so they are stored at the chunk/archetype level rather than per entity, which makes them very efficient, but adding or removing one still means moving entities between chunks, i.e. a structural change.
Enableable components, on the other hand, are toggled per entity, which is also very efficient and requires no structural changes. So systems can benefit from what is effectively an archetype change within the same frame, without waiting for any command-buffer system to play back.
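As a rough sketch of what that buys you (assuming the Entities 1.x API; BurnTimer, Burning, and BurningSystem are hypothetical names):

```csharp
using Unity.Entities;

// Hypothetical status effect, split into a data component and an enableable
// component used purely as a per-entity on/off switch.
public struct BurnTimer : IComponentData { public float SecondsRemaining; }
public struct Burning   : IComponentData, IEnableableComponent {}

public partial struct BurningSystem : ISystem
{
    public void OnUpdate(ref SystemState state)
    {
        float dt = SystemAPI.Time.DeltaTime;

        // By default the query only matches entities whose Burning component
        // is currently enabled.
        foreach (var (timer, burning) in
                 SystemAPI.Query<RefRW<BurnTimer>, EnabledRefRW<Burning>>())
        {
            timer.ValueRW.SecondsRemaining -= dt;
            if (timer.ValueRO.SecondsRemaining <= 0f)
            {
                // Per-entity, immediate, and not a structural change: no ECB,
                // no chunk move, and systems later in the same frame already
                // see the entity as "not burning".
                burning.ValueRW = false;
            }
        }
    }
}
```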
As we all know, ECS entities define their context through component composition (EntityArchetypes) rather than through instances and types.
Some EntityArchetypes may require certain components to always be present while the entity exists, while others may require the same components for only a few frames.
e.g.:
Some projectiles travel in a straight line until they reach their maximum distance or hit something, then get destroyed (these require a long-term Velocity…).
Other projectiles travel in a straight line until they reach their maximum distance, or they hit something and continue to live by sticking to the hit entity
(the projectile is now a child of the hit entity and no longer needs to move using the Velocity component data; a sketch with an enableable Velocity follows below).
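Here is a rough sketch of the sticking case with an enableable Velocity (Entities 1.x API assumed; Velocity, StuckTo, and both systems are hypothetical names, not actual framework code):

```csharp
using Unity.Entities;
using Unity.Mathematics;
using Unity.Transforms;

// Hypothetical components for the projectile example.
public struct Velocity : IComponentData, IEnableableComponent { public float3 Value; }
public struct StuckTo  : IComponentData, IEnableableComponent { public Entity Target; }

public partial struct ProjectileMoveSystem : ISystem
{
    public void OnUpdate(ref SystemState state)
    {
        float dt = SystemAPI.Time.DeltaTime;

        // Only projectiles whose Velocity is enabled are moved; stuck ones are
        // skipped without ever changing archetype.
        foreach (var (transform, velocity) in
                 SystemAPI.Query<RefRW<LocalTransform>, RefRO<Velocity>>())
        {
            transform.ValueRW.Position += velocity.ValueRO.Value * dt;
        }
    }
}

public partial struct ProjectileStickSystem : ISystem
{
    public void OnUpdate(ref SystemState state)
    {
        // Hit detection is omitted; assume something enabled StuckTo and set
        // Target. Instead of removing Velocity via an ECB (a structural change,
        // visible next frame), just disable it -- effective this frame.
        foreach (var (stuck, velocity) in
                 SystemAPI.Query<RefRO<StuckTo>, EnabledRefRW<Velocity>>())
        {
            velocity.ValueRW = false;
        }
    }
}
```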
Sadly, I have libraries that were built to be as performant as possible by minimizing archetype size, and they now run significantly slower because of this.
For example, my effect/stat/requirement library.
10,000,000 entities, idle time without filters triggering.
I was able to pack 230–460 entities per chunk depending on what the effect/conditions were tracking.
Entities 0.51: 2.05 ms
Entities 1.0: 4.34 ms
I actually tested this in 0.51 with 100,000,000 entities, and it scaled linearly to ~20 ms.
I ended up rewriting the library to simply utilize more of each chunk, so it’s not as bad now. It’s still worse than the 0.51 performance levels, but I can deal with it.
You are scheduling one job per chunk? Why not use one of the parallel job types, which schedule a job per worker thread and let the workers sort things out?
If the entities are sparse, you are going to cache-miss regardless of whether they are in the same chunk or in different chunks.
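Something along these lines (a rough sketch with a hypothetical EffectDuration component), where a single ScheduleParallel call spreads the matching chunks across worker threads:

```csharp
using Unity.Burst;
using Unity.Entities;

// Hypothetical per-entity effect data; the point here is the scheduling, not
// the work itself.
public struct EffectDuration : IComponentData { public float SecondsRemaining; }

[BurstCompile]
public partial struct TickEffectsJob : IJobEntity
{
    public float DeltaTime;

    // Runs per matching entity; the job system splits the matching chunks into
    // batches across worker threads for you.
    void Execute(ref EffectDuration duration)
    {
        duration.SecondsRemaining -= DeltaTime;
    }
}

public partial struct TickEffectsSystem : ISystem
{
    public void OnUpdate(ref SystemState state)
    {
        // One scheduled job in total, distributed across worker threads --
        // rather than one job per chunk.
        state.Dependency = new TickEffectsJob { DeltaTime = SystemAPI.Time.DeltaTime }
            .ScheduleParallel(state.Dependency);
    }
}
```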
Sorry. I should have been more specific. Wouldn’t an additional enableable tag component be sufficient?
I think you are misusing the term EntityArchetype. An EntityArchetype is just the full set of component types attached to an entity, ignoring the enabled state of enableable components. If an entity needs certain components for some frames and not others, then unless you use enableable components, the entity will have to switch archetypes. An archetype itself is a static thing; it is the entities that switch which archetype they belong to.
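A rough sketch of the difference (Entities 1.x API assumed; Health and Poisoned are hypothetical components):

```csharp
using Unity.Entities;

// Hypothetical components used only to illustrate archetype membership.
public struct Health   : IComponentData { public float Value; }
public struct Poisoned : IComponentData, IEnableableComponent {}

public static class ArchetypeDemo
{
    public static void Run(EntityManager em)
    {
        var entity = em.CreateEntity(typeof(Health));
        var before = em.GetChunk(entity).Archetype;

        // Adding a component is a structural change: the entity now belongs to
        // a different archetype and has been moved to a different chunk.
        em.AddComponent<Poisoned>(entity);
        var withPoison = em.GetChunk(entity).Archetype; // != before

        // Toggling an enableable component is NOT a structural change: the
        // archetype and chunk stay the same, only a per-entity bit flips.
        em.SetComponentEnabled<Poisoned>(entity, false);
        var stillSame = em.GetChunk(entity).Archetype;  // == withPoison
    }
}
```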
That’s a sign that your job might be too granular (the processor ALUs are under-utilized relative to the number of times you are loading and storing the data). When your performance is directly proportional to the number of cache misses and nothing else, that’s a red flag. You used to only see that with compute shaders, but we’re getting to the point where it can happen on the CPU now too.
Anyways, unless you are working on the same project as @Opeth001, there’s no guarantee that max chunk capacity is actually the performance bottleneck. That’s why I asked: I wanted profiling numbers to get a feel for how big an impact it is, or whether there may be a completely different direction that offers better performance.