Performance-wise, is using ComponentDataFromEntity equivalent to using ForEach(TypeA)? If not, what are the differences in what happens behind the scenes?
ComponentDataFromEntity looks up the Entity’s index in a special array that tells it which chunk the entity lives in, along with its index within that chunk. Then it goes to that chunk, finds the memory offset for the component it is looking for, and finally reads or writes the data.
ForEach skips all of that by iterating the data directly in the chunks linearly.
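To make the difference concrete, here is a minimal sketch contrasting the two access patterns. It assumes the pre-1.0 Entities API (ComponentDataFromEntity, Entities.ForEach, Translation from Unity.Transforms); the system and job names are made up for the example, and the random-access job is shown but not scheduled:

```csharp
using Unity.Burst;
using Unity.Collections;
using Unity.Entities;
using Unity.Mathematics;
using Unity.Transforms;

public class AccessPatternsSystem : SystemBase
{
    // Random access: every lookup resolves Entity -> chunk -> offset before touching the data.
    [BurstCompile]
    struct RandomAccessJob : IJob
    {
        [ReadOnly] public ComponentDataFromEntity<Translation> translationLookup;
        [ReadOnly] public NativeArray<Entity> targets;
        public NativeArray<float3> results;

        public void Execute()
        {
            for (int i = 0; i < targets.Length; i++)
                results[i] = translationLookup[targets[i]].Value; // per-entity indirection
        }
    }

    protected override void OnUpdate()
    {
        float deltaTime = Time.DeltaTime;

        // Linear access: the generated job walks each chunk's component array directly.
        Entities.ForEach((ref Translation translation) =>
        {
            translation.Value.y += deltaTime;
        }).ScheduleParallel();
    }
}
```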
Huh, I didn’t expect that. Is it then better to use ForEach whenever it can handle your use case?
I pretty much moved all my code in my last project to ComponentDataFromEntity in IJobChunks, thinking THAT was the optimal way of handling it, since the Physics samples tend to use IJobChunk. Was that actually hurting my project? Yikes.
Yes. Entities.ForEach is optimal if it handles the use case. Internally, it generates an IJobChunk using ArchetypeChunk.GetNativeArray(ArchetypeChunkComponentType), which returns the array of components within the chunk to directly iterate. You can use this API in your own IJobChunk and get the same performance.
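For reference, a hand-written IJobChunk doing roughly what the codegen produces might look like this sketch (it uses the older ArchetypeChunkComponentType name this thread refers to; in newer Entities versions the same type is called ComponentTypeHandle):

```csharp
using Unity.Burst;
using Unity.Collections;
using Unity.Entities;
using Unity.Transforms;

[BurstCompile]
struct MoveUpJob : IJobChunk
{
    public ArchetypeChunkComponentType<Translation> translationType;
    public float deltaTime;

    public void Execute(ArchetypeChunk chunk, int chunkIndex, int firstEntityIndex)
    {
        // The packed array of Translation components stored in this chunk.
        NativeArray<Translation> translations = chunk.GetNativeArray(translationType);
        for (int i = 0; i < translations.Length; i++)
        {
            var t = translations[i];
            t.Value.y += deltaTime;
            translations[i] = t;
        }
    }
}
```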
Does that apply to using ArchetypeChunk.GetNativeArray for nested loops as well? That is, Entities.ForEach for the outer loop and chunk iteration for the inner loop (instead of EntityQuery.ToComponentArray for the inner)?
What is optimal here depends on your chunk occupancy. If you have good occupancy, then iterating chunks in the inner loop can be fast. However, in most cases it is cheaper to copy the data into a single packed array that you can iterate over n times.
Unless your data set is small, there is almost always a better alternative than iterating over both outer and inner arrays.
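As a sketch of that packed-array pattern (SeekerTag, TargetTag, and NearestTargetDistance are hypothetical components invented for the example; it assumes a recent 0.x Entities version with EntityQuery.ToComponentDataArray and WithDisposeOnCompletion):

```csharp
using Unity.Collections;
using Unity.Entities;
using Unity.Mathematics;
using Unity.Transforms;

// Hypothetical components, just for the example.
public struct SeekerTag : IComponentData { }
public struct TargetTag : IComponentData { }
public struct NearestTargetDistance : IComponentData { public float Value; }

public class NearestTargetSystem : SystemBase
{
    protected override void OnUpdate()
    {
        // Copy the inner set (target positions) into one tightly packed array once...
        var targetQuery = GetEntityQuery(ComponentType.ReadOnly<Translation>(),
                                         ComponentType.ReadOnly<TargetTag>());
        NativeArray<Translation> targetPositions =
            targetQuery.ToComponentDataArray<Translation>(Allocator.TempJob);

        // ...then the outer loop iterates seekers chunk by chunk and scans that packed array.
        Entities
            .WithAll<SeekerTag>()
            .WithReadOnly(targetPositions)
            .WithDisposeOnCompletion(targetPositions)
            .ForEach((ref NearestTargetDistance nearest, in Translation seekerPos) =>
            {
                float best = float.MaxValue;
                for (int i = 0; i < targetPositions.Length; i++)
                    best = math.min(best, math.distance(seekerPos.Value, targetPositions[i].Value));
                nearest.Value = best;
            }).ScheduleParallel();
    }
}
```

The inner data is copied once per update, and the outer Entities.ForEach then scans that flat array for every seeker instead of chasing chunks.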
Thanks. Two further questions:
- Could you explain what it takes for there to be good chunk occupancy?
- I have many cases where this is necessary - for example, checking the distance of every soldier from every enemy. Is this still fine?
- This is not an easy topic for me to explain intuitively at the moment, but basically every chunk has a capacity for the number of entities it can hold, and you want those chunks to be full or mostly full. You also don’t want the chunk capacity itself to be low (too many components and buffers), which causes only 2 or 4 entities to fit in a chunk. A quick way to eyeball occupancy is sketched after these two answers.
- Again, if you have a low number of entities, it makes sense to do this in a single-threaded algorithm. If you have a larger number of entities, you want to use a different pattern instead. Physics broadphase algorithms are quite good at this and are useful for much more than just physics. I wrote a custom multibox SAP which I use for all collision detection as well as AI FOV visibility tests, such as scanning for enemies or interesting points. The project is completely public if you feel like playing around with it or borrowing from it: GitHub - Dreaming381/lsss-wip: Latios Space Shooter Sample - an open Unity DOTS Project using the Latios Framework
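On the occupancy point above, a rough way to eyeball it is to log each chunk’s Count against its Capacity. This is just a sketch assuming EntityManager.GetAllChunks:

```csharp
using Unity.Collections;
using Unity.Entities;
using UnityEngine;

public class ChunkOccupancyLogger : SystemBase
{
    protected override void OnUpdate()
    {
        // Log how full every chunk in the world is relative to its capacity.
        using (NativeArray<ArchetypeChunk> chunks = EntityManager.GetAllChunks(Allocator.TempJob))
        {
            foreach (var chunk in chunks)
                Debug.Log($"Chunk occupancy: {chunk.Count} / {chunk.Capacity}");
        }
        Enabled = false; // one-shot
    }
}
```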
Thank you, I think I understand this topic better. I guess it’s chunk iteration for the majority of my systems, since they act upon entities with 2-5 components. Your explanation causes me to worry about my massive soldier entities though. They will have 30-50 components.
Also, I don’t think physics broadphase algorithms would be appropriate for my scenario. I just want the distance between two points in space, hence the outer (soldier, ForEach) and inner (item or enemy, chunk iteration) loops.
The number of components doesn’t matter as much as their cumulative size. A chunk holds 16 kB, so an entity would have to use a full kilobyte’s worth of components for the chunk capacity to drop to 16 (for comparison, components totaling 256 bytes per entity still allow roughly 16384 / 256 = 64 entities per chunk, before metadata overhead). Granted, that’s actually possible with the use of dynamic buffers, but I know very little about your game and can’t say more than that.
As for just wanting the distance between two points, that’s exactly what broadphases are good at, some better than others depending on what kind of distance query you want. If you are trying to figure out whether two items are within a specified distance of each other, then nearly every broadphase will be faster than the brute-force method for a large number of entities.
This is what I always thought, but it turns out this is not exactly true depending on your data.
Some of our archetypes are limited to just 4-6 entities in a chunk, yet their component sizes only add up to 300-600 bytes. I even went through and optimized all their layouts (there were only a couple of issues) and it barely made a difference.
I even wrote a test to prove this just to make sure I wasn’t going insane, simply adding one component at a time (of different sizes) and then printing out the number of entities that can fit in a chunk.
It rarely came close to being ideal.
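A test along those lines can be as simple as the following sketch (not the exact test; the component names are placeholders, and it assumes EntityManager.GetChunk and ArchetypeChunk.Capacity):

```csharp
using Unity.Entities;
using Unity.Mathematics;
using UnityEngine;

// Placeholder components of different sizes, just for the test.
public struct SmallComponent  : IComponentData { public float    Value; } // 4 bytes
public struct MediumComponent : IComponentData { public float4x4 Value; } // 64 bytes

public class ChunkCapacityProbe : MonoBehaviour
{
    void Start()
    {
        var em = World.DefaultGameObjectInjectionWorld.EntityManager;

        var small = em.CreateEntity(typeof(SmallComponent));
        var both  = em.CreateEntity(typeof(SmallComponent), typeof(MediumComponent));

        // ArchetypeChunk.Capacity reports how many entities of that archetype fit in one chunk.
        Debug.Log($"Small only:     {em.GetChunk(small).Capacity} entities per chunk");
        Debug.Log($"Small + Medium: {em.GetChunk(both).Capacity} entities per chunk");
    }
}
```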
I was admittedly oversimplifying. Chunks also contain a bunch of other metadata so that jobs can navigate them in a Burst context. And there are also some issues that cause Unity to be a little wasteful. Those issues will be resolved in time.