Combine IJob JobHandle dependencies

Hi,

I have trouble getting job dependencies working with a variable number of jobs.

This is the general setup right now:

I have a source array src with e.g. 10000 elements → NativeArray(10000), read-only
I have a counts array counts with 256 * numJobs elements → NativeArray(256 * numJobs), written to

I’d like to schedule numJobs jobs that work in parallel on sub-parts of these arrays:

Job1 - src SubPart 0 - 999, count subpart 0 - 255
Job2 - src SubPart 1000 - 1999, count subpart 256 - 511

The jobs only work in their own subparts → no race conditions

It is a generic IJob struct, currently defined as:

    public partial struct RadixCount<T> : IJob where T : struct, IRadixSortableInt
    {
        [NativeDisableParallelForRestriction]
        public NativeArray<int> myCounts;
        [NativeDisableParallelForRestriction]
        [ReadOnly] public NativeArray<T> mySrc;
        [ReadOnly] public int keyOffset;
        public void Execute()
        {
            for (int i = 0; i < mySrc.Length ; i++)
            {
                myCounts[(byte)(((math.asuint(mySrc[i].GetKey()) ^ 0x80000000) >> keyOffset) & 0x000000FF)] += 1;
            }
        }
    }

Starting these jobs with :

    public static class RadixMT
    {
        public static void RankSortInt<T>(NativeArray<int> ranks, NativeArray<T> src) where T : struct, IRadixSortableInt
        {
            const int sliceSize = 10000;
            int count = src.Length;
            int numThreads = (count / sliceSize) + 1;
            NativeArray<int> counts = new NativeArray<int>(256 * numThreads, Allocator.TempJob, NativeArrayOptions.ClearMemory);
            NativeArray<int> prefixSum = new NativeArray<int>(256 * numThreads, Allocator.TempJob, NativeArrayOptions.UninitializedMemory);
            NativeArray<Indexer> frontArray = new NativeArray<Indexer>(count, Allocator.TempJob, NativeArrayOptions.UninitializedMemory);
            NativeArray<Indexer> backArray = new NativeArray<Indexer>(count, Allocator.TempJob, NativeArrayOptions.UninitializedMemory);

            NativeArray<JobHandle> handles = new NativeArray<JobHandle>(numThreads, Allocator.TempJob);

            for (int t = 0;t<numThreads;t++)
            {
                handles[t] = new RadixCount<T> { mySrc = src.GetSubArray(t * sliceSize, math.min(sliceSize, src.Length - (t * sliceSize))),
                                                  keyOffset = 0,
                                                  myCounts = counts.GetSubArray(t * 256, 256)}.Schedule();
            }
            JobHandle.CombineDependencies(handles).Complete();
        }
    }

Also tried :

            JobHandle radixCounts = new JobHandle();
            for (int t = 0;t<numThreads;t++)
            {
                JobHandle j = new RadixCount<T> { mySrc = src.GetSubArray(t * sliceSize, math.min(sliceSize, src.Length - (t * sliceSize))),
                                                  keyOffset = 0,
                                                  myCounts = counts.GetSubArray(t * 256, 256)}.Schedule();
                radixCounts = JobHandle.CombineDependencies(radixCounts, j);
            }
            radixCounts.Complete();

But I get error messages stating:

InvalidOperationException: The previously scheduled job RadixCount`1 reads from the Unity.Collections.NativeList`1[EndPoint] RadixCount`1.mySrc. You must call JobHandle.Complete() on the job RadixCount`1, before you can write to the Unity.Collections.NativeList`1[EndPoint] safely

When not running in parallel, completing each job before scheduling the next:

            for (int t = 0;t<numThreads;t++)
            {
                JobHandle j = new RadixCount<T> { mySrc = src.GetSubArray(t * sliceSize, math.min(sliceSize, src.Length - (t * sliceSize))),
                                                  keyOffset = 0,
                                                  myCounts = counts.GetSubArray(t * 256, 256)}.Schedule();
                j.Complete();
            }

It works OK.
What would be the correct way to handle the dependencies?

I think what you want here is IJobParallelForBatch.

Thanks,

I did not know about IJobParallelForBatch; it can definitely solve part of my problem.
It looks like an example of a custom job type, which (I hope) will be maintained by Unity.

Besides that, I think I will still need to schedule jobs and manage their dependencies myself.
So the question still is: “How can I handle the dependencies of a number of jobs that is not known up front?”

In other words :
1/ Start a number of (same or different) jobs.
2/ Let those jobs all run in parallel
3/ Wait for the completion of all these jobs.
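
To make these three steps concrete, here is a minimal sketch of the pattern. It assumes every job gets its own container that no other job touches, which is not the case in my setup above. FillJob, CombineHandlesSketch and the array length of 256 are hypothetical names and values used only for illustration.

    using Unity.Collections;
    using Unity.Jobs;

    // Hypothetical job for illustration: fills its own private array.
    public struct FillJob : IJob
    {
        public NativeArray<int> output;
        public int value;

        public void Execute()
        {
            for (int i = 0; i < output.Length; i++)
                output[i] = value;
        }
    }

    public static class CombineHandlesSketch
    {
        public static void Run(int jobCount)
        {
            var outputs = new NativeArray<int>[jobCount];
            // The handle list is only used on the main thread, so Allocator.Temp is enough.
            var handles = new NativeArray<JobHandle>(jobCount, Allocator.Temp);

            // 1/ Start the jobs; each one writes to a separately allocated array.
            for (int t = 0; t < jobCount; t++)
            {
                outputs[t] = new NativeArray<int>(256, Allocator.TempJob);
                handles[t] = new FillJob { output = outputs[t], value = t }.Schedule();
            }

            // 2/ and 3/ Merge all handles into one and wait for it.
            JobHandle.CombineDependencies(handles).Complete();

            // Results could be read from outputs[t] here; release everything afterwards.
            handles.Dispose();
            for (int t = 0; t < jobCount; t++)
                outputs[t].Dispose();
        }
    }
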

@DreamingImLatios : As you probably have seen I am trying to make a parallel version of your Radix sort.

Edit : My current model (above) does not fit into the IJobParallelForBatch.
It needs two separate startIndex/length pairs, one for NativeArray src and one for NativeArray count.

Should be. It is fully documented in 2022.2 and games and asset store packages have been using it for several years now.

You can only slice up arrays in parallel using a single parallel job. Since your strides are different, you will need to use [NativeDisableParallelForRestriction] on at least one of your arrays if you are writing to multiple arrays at once. If you have a [ReadOnly] array, you do not need [NativeDisableParallelForRestriction] on it, and you can read any index without issue.
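
As a rough sketch of what that could look like (not tested against your project): a single IJobParallelFor where each index is one slice, and both sub-ranges are derived from that slice index. To keep it short I assume plain int keys instead of your generic IRadixSortableInt, and the names RadixCountSlices, RadixCountSketch and the sliceSize field are made up for this example.

    using Unity.Collections;
    using Unity.Jobs;
    using Unity.Mathematics;

    // One Execute call per slice; src is read-only, counts is written at
    // indices other than the Execute index, hence the restriction attribute.
    public struct RadixCountSlices : IJobParallelFor
    {
        [ReadOnly] public NativeArray<int> src;
        [NativeDisableParallelForRestriction] public NativeArray<int> counts;
        public int sliceSize;
        public int keyOffset;

        public void Execute(int slice)
        {
            // Two different strides, both computed from the same slice index.
            int srcStart = slice * sliceSize;
            int srcEnd = math.min(srcStart + sliceSize, src.Length);
            int countStart = slice * 256;

            for (int i = srcStart; i < srcEnd; i++)
            {
                uint key = math.asuint(src[i]) ^ 0x80000000u;
                int bucket = (int)((key >> keyOffset) & 0xFFu);
                counts[countStart + bucket] += 1;
            }
        }
    }

    public static class RadixCountSketch
    {
        // Returns 256 counters per slice; the caller must Dispose the result.
        public static NativeArray<int> Count(NativeArray<int> src, int sliceSize)
        {
            int numSlices = (src.Length + sliceSize - 1) / sliceSize;
            // NativeArrayOptions.ClearMemory is the default, so counters start at 0.
            var counts = new NativeArray<int>(256 * numSlices, Allocator.TempJob);

            var job = new RadixCountSlices
            {
                src = src,
                counts = counts,
                sliceSize = sliceSize,
                keyOffset = 0
            };
            // Batch count 1: each slice can be picked up by a different worker.
            job.Schedule(numSlices, 1).Complete();
            return counts;
        }
    }

With this layout there is only one JobHandle to complete, no matter how many slices there are. The slices never overlap in counts, so disabling the parallel-for restriction is safe here, but the safety system no longer checks that for you.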

The counting is the fast part of the algorithm. If it needs to be faster, I could rewrite it to vectorize it. I don’t know a fast way to do the element movement part of the algorithm in parallel. Also, I haven’t tried, because you need a lot of elements for the sort to become a significant performance bottleneck worth optimizing. Now I’m curious. What’s your use case for parallelizing it?