I’ve read the “ECS features in detail” section of the documentation and want to check whether my understanding of the data layout for entities and components is correct.
Chunks
Data is stored by Entity Archetype in 16 KB chunks.
A chunk is arranged by component streams: all of component A, followed by all of component B, and so on.
Is the chunk split up on creation such that the space for all component streams is already reserved? Like so:
[A, A, A, A, A, A][B, B, B, B, B, B]
even if there’s only one Entity? When you add an Entity, you just copy its component data straight to the matching index in each stream. This is pretty neat, as allocating n entities of an archetype is virtually a no-op.
Or do you compact the streams so that they occupy memory like so:
[A, A][B, B]
for two entities? If you add an Entity to this structure, you then have to move every later component stream down in memory to get [A, A, A][B, B, B]. I can’t imagine this would work anyway, as it would involve re-indexing all the entities?
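To make the first option concrete, here is a rough Python sketch of the pre-reserved layout as I picture it. It is not Unity’s code: `plan_chunk`, `component_address` and the component sizes are made up for illustration, and it ignores any chunk header or alignment padding.

```python
CHUNK_SIZE = 16 * 1024  # 16 KB, the chunk size mentioned above


def plan_chunk(component_sizes):
    """Return (capacity, offsets) for a chunk with one stream per component.

    component_sizes maps a component name to its size in bytes (made-up values).
    """
    bytes_per_entity = sum(component_sizes.values())
    capacity = CHUNK_SIZE // bytes_per_entity
    offsets = {}
    cursor = 0
    for name, size in component_sizes.items():
        offsets[name] = cursor      # start of this component's stream
        cursor += size * capacity   # reserve room for `capacity` entities up front
    return capacity, offsets


def component_address(offsets, component_sizes, name, index):
    """Byte offset inside the chunk of component `name` for entity slot `index`."""
    return offsets[name] + index * component_sizes[name]


sizes = {"A": 12, "B": 4}  # hypothetical sizes in bytes
capacity, offsets = plan_chunk(sizes)
third_b = component_address(offsets, sizes, "B", 2)
```

With the streams reserved like this, adding an entity only bumps a count and writes each component at `offset + index * size`; nothing already in the chunk has to move.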
Entities
All entities are stored in a single EntityData struct array. Entity.index is the index into this array, and EntityData provides a direct address to the entity’s components. Is an Entity struct also stored in the chunk so it can refer back to the entities array? Is this what EntityArray is generated from?
Since a user can store an Entity, am I right in assuming that the items in the entities array never change position? If you add 1000 entities and remove the first 999, is that last entity still going to be in the 1000th slot?
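Here is a small Python sketch of the behaviour I’m assuming for the entities array: slots never move, and a freed slot is only handed out again for a new entity. `EntityTable` and its free list are my own invention for illustration, not Unity’s implementation.

```python
class EntityTable:
    """Toy stand-in for the entities array: slots never move once assigned."""

    def __init__(self):
        self.slots = []  # slots[i] is (chunk_id, index_in_chunk), or None if free
        self.free = []   # indices of freed slots that can be reused

    def create(self, chunk_id, index_in_chunk):
        location = (chunk_id, index_in_chunk)
        if self.free:
            index = self.free.pop()      # reuse a freed slot
            self.slots[index] = location
        else:
            index = len(self.slots)      # otherwise grow the array
            self.slots.append(location)
        return index

    def destroy(self, index):
        self.slots[index] = None
        self.free.append(index)


table = EntityTable()
ids = [table.create(chunk_id=0, index_in_chunk=i) for i in range(1000)]
for index in ids[:999]:
    table.destroy(index)
last = ids[-1]  # still slot 999 (zero-based), i.e. the 1000th slot
```

The location stored in a slot could be rewritten if the entity’s data moves inside a chunk, but the slot index that the user holds on to stays the same.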
Archetypes
If you add a new component to an Entity, it moves that Entity from its current chunk to a chunk matching the new archetype. So there’ll be at least one chunk of memory for every archetype that is actually in use. If the user doesn’t specify a full archetype ahead of time, Unity will create one on demand along with a chunk for it. So adding a unique component to one Entity creates a new chunk just for that one entity.
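A minimal Python sketch of the move I’m describing, with a plain dict per archetype standing in for its chunks. `World` and its methods are hypothetical names, not Unity’s API.

```python
class World:
    """Toy model: storage is keyed by archetype (the set of component types)."""

    def __init__(self):
        # archetype (frozenset of type names) -> {entity: {type name: value}}
        self.archetypes = {}
        self.entity_archetype = {}

    def create(self, entity, components):
        archetype = frozenset(components)
        self.archetypes.setdefault(archetype, {})[entity] = dict(components)
        self.entity_archetype[entity] = archetype

    def add_component(self, entity, type_name, value):
        old = self.entity_archetype[entity]
        data = self.archetypes[old].pop(entity)  # take it out of the old storage
        data[type_name] = value
        new = old | {type_name}
        # Storage for the new archetype is created on demand.
        self.archetypes.setdefault(new, {})[entity] = data
        self.entity_archetype[entity] = new


world = World()
world.create(1, {"Position": (0, 0)})
world.create(2, {"Position": (1, 1)})
world.add_component(2, "Velocity", (1, 0))
```

After the last call, entity 2 is the only occupant of the Position + Velocity storage, which mirrors the “new chunk just for that one entity” case.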
ComponentDataArray
When we access the components via ComponentDataArray, are these direct pointers to the chunk data or are components copied into temp storage at the start of a system and back again at the end?
Looking at the source code for ComponentDataArray, the iterator jumps from chunk to chunk instead of being contiguous so I’d assume they’re direct pointers.
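To show what I mean by jumping from chunk to chunk, here is a toy Python view that presents several chunk streams as one logical array and reads and writes the chunk data in place. `ChunkedArray` is a made-up name, and this is only my guess at the idea, not how ComponentDataArray is actually written.

```python
class ChunkedArray:
    """Toy view that indexes several chunk streams as one logical array."""

    def __init__(self, chunks):
        self.chunks = chunks  # list of lists; each inner list is one chunk's stream

    def __len__(self):
        return sum(len(chunk) for chunk in self.chunks)

    def _locate(self, index):
        if index < 0:
            raise IndexError("negative indices are not supported")
        for chunk in self.chunks:
            if index < len(chunk):
                return chunk, index
            index -= len(chunk)  # skip past this chunk
        raise IndexError("index out of range")

    def __getitem__(self, index):
        chunk, local = self._locate(index)
        return chunk[local]

    def __setitem__(self, index, value):
        chunk, local = self._locate(index)
        chunk[local] = value  # writes straight into the chunk, no copy back


chunk_a = [1, 2, 3]
chunk_b = [4, 5]
view = ChunkedArray([chunk_a, chunk_b])
view[3] = 40  # lands in chunk_b[0]
```

Because the view holds references to the chunk lists themselves, the write shows up in `chunk_b` immediately, which is the “direct pointer” behaviour I’m assuming.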
SharedComponentData (SCD)
An SCD is part of the archetype and each unique (by value, not type) instance of an SCD requires its own chunk. So an entity archetype will be split over as many chunks as there are unique SCDs.
The SCDs are stored in their own type arrays somewhere, not in the archetype chunk; the chunk just contains an index into that array.
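As a sketch of that arrangement: a table that keeps one copy of each unique shared value and hands back an index for the chunk to store. `SharedValueTable` and `intern` are hypothetical names, and the values are assumed to be hashable.

```python
class SharedValueTable:
    """Toy store that keeps each unique shared value once and returns its index."""

    def __init__(self):
        self.values = []    # unique shared values, in insertion order
        self.index_of = {}  # value -> index into self.values

    def intern(self, value):
        if value not in self.index_of:
            self.index_of[value] = len(self.values)
            self.values.append(value)
        return self.index_of[value]


table = SharedValueTable()
red = table.intern("red_material")
blue = table.intern("blue_material")
red_again = table.intern("red_material")  # same value, same index

# A chunk then only needs to remember the index, not the value itself.
chunk = {"shared_index": red, "entities": [10, 11, 12]}
```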
Filtering
Does the SCD include metadata with references back to the chunks that match it?
If so, filtering on an SCD should be super quick, depending on the ratio of entities to unique SCD values.
If you had 1000 entities split up into units of 100 by SCD, then the filter would just search the 10 items in the SCD array and, from that, could directly locate the relevant archetype chunks?
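Here is the 1000-entity example as a Python sketch, assuming a lookup from shared value index to the chunks that use it. Whether Unity keeps such a back-reference is exactly what I’m asking; `chunks_by_shared_index`, `add_entity`, `filter_by_shared_index` and the chunk capacity of 64 are all made up.

```python
from collections import defaultdict

# Hypothetical back-reference: shared value index -> chunks holding that value.
chunks_by_shared_index = defaultdict(list)


def add_entity(shared_index, entity, chunk_capacity=64):
    """Append the entity to the last chunk for this shared value, or start a new one."""
    chunks = chunks_by_shared_index[shared_index]
    if not chunks or len(chunks[-1]) >= chunk_capacity:
        chunks.append([])
    chunks[-1].append(entity)


def filter_by_shared_index(shared_index):
    """Return every entity whose chunk is tagged with this shared value index."""
    chunks = chunks_by_shared_index.get(shared_index, [])
    return [entity for chunk in chunks for entity in chunk]


# 1000 entities in units of 100, giving 10 unique shared values.
for entity in range(1000):
    add_entity(shared_index=entity // 100, entity=entity)

matches = filter_by_shared_index(3)
```

The filter only touches the chunks listed under one shared value index, so its cost depends on how many entities share that value, not on the total entity count.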
If I’m right on most of the above, then this structure is pretty damn awesome and I can see why creating and processing thousands of entities is so fast. I think I cleared up a lot of my own misunderstanding in the process of writing this out. Unless I’m completely wrong.