NativeStream: Is there any way to know in advance how many "threads" an IJobForEach will have?

I’m considering using NativeStream, but there’s one thing I’m not sure about: how do you define your stream’s “foreachCount” if you want to write to it from an IJobForEach?

With an IJobParallelFor, you control the count so it’s not a problem. With IJobChunk that count will be the number of chunks (is that correct?). But with IJobForEach, can you know?

Also, am I correct in assuming that two IJobChunks won’t be able to both write to the same stream because their thread count could vary? Unless we initialize the stream’s “foreachCount” with the largest chunk count of all the IJobChunks that will attempt to write in it?

Is there a way to find a safe maximum “foreachCount” that we can init a NativeStream with to make sure any job will be able to write to it? And will there be a noteworthy performance penalty if most jobs don’t use every index of the stream?

IJobForEach works the same way as IJobChunk: per chunk. The JobsUtility class has a max thread count or something like that, so the number of threads should be something like math.min(JobsUtility.MaxThreadCount, chunkArray.Length). You can pass a custom query when scheduling the job, or use JobsUtility to get the generated query. Unfortunately, as far as I know, the only way to get the chunk count is to create a temporary NativeArray using CreateArchetypeChunkArray from the EntityQuery. That said, you don’t know exactly which thread index you will get, so maybe sizing the stream by the max thread count would be the way to go, or maybe the allocation is lazy and only happens on first access?

  • I’m on mobile now so the exact names could be wrong.

You are starting to run into some of the issues with NativeStream. You can use IJobForEach with NativeStream by allocating a foreachCount equal to the entity count (you can get this from the EntityQuery), but this does not perform very well, and it is usually the reason I end up using NativeQueue and then sorting the results if I care about determinism. In general, if you want performance with NativeStream and entity iteration, IJobChunk is the weapon of choice, with a foreachCount equal to the EntityQuery’s chunk count.

As for having multiple chunks write to the same ForEachIndex, you need to use IJobParallelForBatch for that. Is there a particular use case you have in mind?

Oh no, I just meant:
If I have an IJobChunk with 4 threads and an IJobChunk with 6 threads, is it possible to make them write to the same stream (not the same stream index)?

I’m guessing the answer is to just have a stream with the max possible foreachCount.

I’ve done this, Phil, with seemingly no ill effects. Create the stream with a foreachCount of 10, and offset the second job’s start index by 4.

From memory you have to disable container safety restrictions.
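
A minimal sketch of that offset approach, for illustration only. The job and class names (OffsetWriteJob, SharedStreamExample, IndexOffset) are hypothetical, IJobParallelFor stands in for the two IJobChunks to keep the example small, and the use of [NativeDisableContainerSafetyRestriction] follows the "from memory" remark above rather than anything verified against a specific package version.

```csharp
using Unity.Collections;
using Unity.Collections.LowLevel.Unsafe;
using Unity.Jobs;

// Hypothetical job: each iteration writes one int into its own stream index.
struct OffsetWriteJob : IJobParallelFor
{
    // Assumption: safety checks are disabled because two jobs share the same
    // stream and each writes to indices shifted by its own offset.
    [NativeDisableContainerSafetyRestriction]
    public NativeStream.Writer Writer;

    // First stream index owned by this job.
    public int IndexOffset;

    public void Execute(int index)
    {
        Writer.BeginForEachIndex(IndexOffset + index);
        Writer.Write(index);
        Writer.EndForEachIndex();
    }
}

static class SharedStreamExample
{
    public static void Run()
    {
        const int firstCount = 4;
        const int secondCount = 6;

        // One stream sized for both jobs: 4 + 6 = 10 foreach indices.
        var stream = new NativeStream(firstCount + secondCount, Allocator.TempJob);

        // The first job owns indices 0..3, the second owns 4..9.
        JobHandle first = new OffsetWriteJob { Writer = stream.AsWriter(), IndexOffset = 0 }
            .Schedule(firstCount, 1);
        JobHandle second = new OffsetWriteJob { Writer = stream.AsWriter(), IndexOffset = firstCount }
            .Schedule(secondCount, 1);

        JobHandle.CombineDependencies(first, second).Complete();

        // A reader from stream.AsReader() could now walk indices 0..9 in order.
        stream.Dispose();
    }
}
```

Because every foreach index is written by exactly one job iteration, the read order stays deterministic even though the two jobs run concurrently.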

If you want to do chunk processing, then you would use EntityQuery.CalculateChunkCount() as the foreachCount on allocation, and call BeginForEachIndex with the chunkIndex passed into IJobChunk.
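
As a rough sketch of that pattern, assuming the older IJobChunk signature that passes chunkIndex into Execute (the job and helper names here are made up for illustration, and the payload written is just the chunk’s entity count):

```csharp
using Unity.Collections;
using Unity.Entities;
using Unity.Jobs;

// Hypothetical job: writes each chunk's entity count into the stream,
// using the chunk index as the foreach index.
struct WriteChunkCountsJob : IJobChunk
{
    public NativeStream.Writer Writer;

    public void Execute(ArchetypeChunk chunk, int chunkIndex, int firstEntityIndex)
    {
        Writer.BeginForEachIndex(chunkIndex);
        Writer.Write(chunk.Count);
        Writer.EndForEachIndex();
    }
}

static class ChunkStreamExample
{
    // The caller is responsible for completing the returned handle
    // and disposing the stream afterwards.
    public static JobHandle Schedule(EntityQuery query, JobHandle dependency, out NativeStream stream)
    {
        // One foreach index per chunk matched by the query.
        stream = new NativeStream(query.CalculateChunkCount(), Allocator.TempJob);

        var job = new WriteChunkCountsJob { Writer = stream.AsWriter() };

        // Assumption: in the Entities version this thread refers to, Schedule
        // runs the job in parallel over chunks; later versions expose
        // ScheduleParallel for that.
        return job.Schedule(query, dependency);
    }
}
```

The chunk count has to be calculated from the same query the job is scheduled with, otherwise chunkIndex could fall outside the stream’s range.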

But really, use whatever works for the way you want to write to addressable indices and read from them later on. There is no noticeable performance impact from having a large number of indices, as long as you don’t completely abuse it and have 1 int per foreach index. (It wouldn’t be super bad, but it’s obviously not the intended setup for performance.)

This way you also guarantee determinism. Memory is still allocated on a per thread basis, so will be shared between multiple chunk iterations. It’s quite awesome how NativeStream fits together once you grok the pattern. It’s so far the fastest pattern for queuing I’ve ever seen used in multithreaded code.

Just making sure, did you mean to write “there is no noticeable performance impact” here?

Yes, no noticeable perf impact.

So if I do this:

int myForeachCount = JobsUtility.MaxJobThreadCount;

Can I assume that the foreachCount will always be sufficient, even if I end up in a hypothetical situation where an IJobChunk has more chunks than “JobsUtility.MaxJobThreadCount”, which is currently 128? Is it even possible for an IJobChunk to have “chunkIndexes” that are greater than MaxJobThreadCount?

Basically, I need to design a system that uses NativeStream and that will have to be useable by anyone (it’ll be an asset store package one day, maybe). So I can’t know the max foreachCount in advance. I need a way to know the worst case scenario, but without setting the foreachCount to something ridiculous like int.MaxValue

In Physics we determine the foreach count on a job using ScheduleConstruct.
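
A hedged sketch of what deferred construction can look like. It assumes the NativeStream.ScheduleConstruct overload that reads the foreach count from element 0 of a NativeArray<int>, which may differ between Collections versions. ComputeCountJob, WorkItems and DeferredStreamExample are hypothetical names, and this is not taken from the Physics package itself.

```csharp
using Unity.Collections;
using Unity.Jobs;

// Hypothetical job that decides the foreach count at job time.
struct ComputeCountJob : IJob
{
    public NativeArray<int> Count;
    public int WorkItems; // stand-in for whatever determines the count

    public void Execute()
    {
        Count[0] = WorkItems;
    }
}

static class DeferredStreamExample
{
    // The caller must complete the returned handle (or chain jobs onto it)
    // and dispose both the stream and the count array afterwards.
    public static JobHandle Schedule(int workItems, out NativeStream stream, out NativeArray<int> count)
    {
        count = new NativeArray<int>(1, Allocator.TempJob);

        JobHandle countHandle = new ComputeCountJob { Count = count, WorkItems = workItems }
            .Schedule();

        // Assumption: the stream is allocated inside a job that runs after
        // countHandle, using count[0] as its foreach count.
        return NativeStream.ScheduleConstruct(out stream, count, countHandle, Allocator.TempJob);
    }
}
```

Jobs that write to the stream would then take the returned handle as their dependency, so the foreach count never has to be known on the main thread.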

The important part is that you assign the index deterministically. With a fixed number of indices but a dynamic workload, I don’t see how that’s possible.