I am trying to parallelize chunk loading in my game. I’m considering several things and jobs seem really nice. I’ve never used jobs before but I’m quite familiar with low-level data management and I have some ancient experience with multithreading. Trying to overcome the limitations posed by jobs is quite the challenge though. I am NOT using ECS; this is the pure job system and MonoBehaviours.
I’m trying to generate a 2D array of noise points in a job. The first hurdle I needed to overcome was that I use the FastNoise2 library. This is a library written without jobs in mind (it has state through a sort of node system, and it’s even a native library), but in theory, generating noise only reads from the noise generator and doesn’t write. My current solution is to use the [NativeDisableUnsafePtrRestriction] attribute, and it seems to be working (it gets past the noise gen line without errors). If anyone has a better idea for this, let me know, but this is not the biggest problem (worst case, I’ll write the noise generating function myself; I don’t need state anyway).
The second hurdle is outputting the points. I’m really breaking my head over this and I seem to be getting nowhere. The (simplified) structure of my job is as follows:
public struct ChunkData {
    public Vector2Int startOffset;
    public float[] noiseData; // <----- What do I put here???
}

public struct GenerateNoiseJob : IJobParallelFor {
    [NativeDisableUnsafePtrRestriction] public FastNoise noiseGen;
    public NativeArray<ChunkData> chunks;

    public void Execute(int chunkIndex) {
        Vector2Int offset = chunks[chunkIndex].startOffset;
        float[] noiseMap = noiseGen.Gen2D(size, size, offset.x, offset.y); // actual call differs but this is the gist
        for (int i = 0; i < noiseMap.Length; i++) {
            // Do some additional processing....
            chunks[chunkIndex].noiseData[i] = noiseMap[i];
        }
    }
}
For each chunk, I need to generate a noise map and then I need to process that noise map before I send it back to the managed code. Fairly simple. Except that the above code is not allowed, even though the jobs never touch each other’s data. I know this, but the compiler doesn’t, and there lies the problem. The noiseData array is of known size and will only contain blittable data types, but any sort of nested collection is disallowed inside the ChunkData struct. On top of that, I’d honestly really like it if the data was not copied at all, and the job instead held a reference to a pre-allocated block of memory that it could modify in place. I can make sure this stays thread-safe.
So the second question is, with nested arrays disallowed, how can I operate (read and write) on a fixed-size collection of pure, blittable data for each job in a parallel job? Possible things I’ve found are doing some unsafe things with IntPtrs (I suppose that’s entirely manual data management), or using something like the FixedListFloat32 for points. Disadvantages are that IntPtr is better avoided (but I can avoid copying!), and the FixedList really only supports three data types of different sizes (packing is an option but also more work), so I can’t move a lot of the processing logic to the job if I go that route.
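One direction that might sidestep the nested-array restriction, as a minimal sketch rather than a tested solution: keep a single flat NativeArray<float> for all chunks and let each parallel iteration write only to its own contiguous range. The names GenerateNoiseFlatJob, FlatNoiseExample and SampleNoise are made up for illustration, and SampleNoise is only a stand-in for the real FastNoise2 call plus the extra processing. The sketch assumes [NativeDisableParallelForRestriction] is acceptable here, which means keeping the ranges disjoint is my responsibility and not the safety system's.

```csharp
using Unity.Collections;
using Unity.Jobs;
using UnityEngine;

public struct GenerateNoiseFlatJob : IJobParallelFor
{
    public int size;                                    // points per chunk side
    [ReadOnly] public NativeArray<Vector2Int> offsets;  // one start offset per chunk

    // Chunk k owns the index range [k * size * size, (k + 1) * size * size).
    // The attribute lifts the "only write to your own index" check,
    // so nothing verifies that the ranges stay disjoint.
    [NativeDisableParallelForRestriction]
    public NativeArray<float> noise;

    public void Execute(int chunkIndex)
    {
        int pointsPerChunk = size * size;
        int baseIndex = chunkIndex * pointsPerChunk;
        Vector2Int offset = offsets[chunkIndex];

        for (int i = 0; i < pointsPerChunk; i++)
        {
            int x = offset.x + i % size;
            int y = offset.y + i / size;
            noise[baseIndex + i] = SampleNoise(x, y);
        }
    }

    // Hypothetical placeholder, NOT the FastNoise2 API.
    static float SampleNoise(int x, int y)
    {
        return Mathf.Sin(x * 0.1f) * Mathf.Cos(y * 0.1f);
    }
}

public static class FlatNoiseExample
{
    public static void Run(Vector2Int[] chunkOffsets, int size)
    {
        int chunkCount = chunkOffsets.Length;
        var offsets = new NativeArray<Vector2Int>(chunkOffsets, Allocator.TempJob);
        var noise = new NativeArray<float>(chunkCount * size * size, Allocator.TempJob);

        var job = new GenerateNoiseFlatJob { size = size, offsets = offsets, noise = noise };
        JobHandle handle = job.Schedule(chunkCount, 1); // one chunk per batch
        handle.Complete();

        // GetSubArray returns a view on the same memory; nothing is copied.
        for (int k = 0; k < chunkCount; k++)
        {
            NativeArray<float> chunk = noise.GetSubArray(k * size * size, size * size);
            Debug.Log($"Chunk {k}: first value {chunk[0]}");
        }

        offsets.Dispose();
        noise.Dispose();
    }
}
```

The buffer is allocated once on the main thread and the job writes into it directly, so the only copy is whatever happens when the results are handed to managed code afterwards. If the buffer has to outlive a few frames, Allocator.Persistent would be the safer choice than Allocator.TempJob.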
The third hurdle is perhaps the most complicated and might deserve its own thread. I’ve not properly researched this yet, but the noise data is being used to load in-world chunks. Chunks don’t need to load all at once in the frame where they are requested; they can arrive staggered at some point in time after they are requested. An IJobParallelFor only seems to report completion once every iteration has finished. If it needs to generate 13 chunks on 6 workers, and assuming every chunk takes exactly as long, 12 chunks are already done while the 13th is still being generated, but none of the data can be used until that last chunk is done. This is inefficient: I could create a GameObject the moment the data for a chunk is ready, no matter how many other chunks still need to be generated.
I suppose for the last question I’m wondering how to approach this with jobs. IJobParallelFor seems to be out of the question (single blit back and forth for the entire batch). A plain IJob seems okay from what I understand about the system, but I don’t have enough information to know if I can actually leverage that to achieve what I described above (I don’t know if it actually works that way).
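A sketch of how the plain IJob route might look, under the assumption that one job per chunk is scheduled and its JobHandle is polled every frame: each chunk gets its own buffer, and a chunk is turned into a GameObject as soon as its own handle reports completion, independent of the others. GenerateChunkJob, ChunkLoader, RequestChunk and BuildChunk are hypothetical names, and the noise expression is again a stand-in for the real generator.

```csharp
using System.Collections.Generic;
using Unity.Collections;
using Unity.Jobs;
using UnityEngine;

public struct GenerateChunkJob : IJob
{
    public Vector2Int startOffset;
    public int size;
    public NativeArray<float> noise; // this chunk's own buffer, size * size floats

    public void Execute()
    {
        for (int i = 0; i < noise.Length; i++)
        {
            int x = startOffset.x + i % size;
            int y = startOffset.y + i / size;
            // Stand-in for the real noise call and the additional processing.
            noise[i] = Mathf.Sin(x * 0.1f) * Mathf.Cos(y * 0.1f);
        }
    }
}

public class ChunkLoader : MonoBehaviour
{
    struct Pending
    {
        public JobHandle handle;
        public NativeArray<float> noise;
        public Vector2Int offset;
    }

    const int Size = 32; // assumed chunk side length
    readonly List<Pending> pending = new List<Pending>();

    public void RequestChunk(Vector2Int offset)
    {
        // Persistent, because the job may take more than a few frames.
        var noise = new NativeArray<float>(Size * Size, Allocator.Persistent);
        var job = new GenerateChunkJob { startOffset = offset, size = Size, noise = noise };
        pending.Add(new Pending { handle = job.Schedule(), noise = noise, offset = offset });
        JobHandle.ScheduleBatchedJobs(); // ask the workers to start right away
    }

    void Update()
    {
        // Iterate backwards so finished entries can be removed in place.
        for (int i = pending.Count - 1; i >= 0; i--)
        {
            Pending p = pending[i];
            if (!p.handle.IsCompleted) continue;

            p.handle.Complete(); // still required before the main thread reads the data
            BuildChunk(p.offset, p.noise);
            p.noise.Dispose();
            pending.RemoveAt(i);
        }
    }

    void OnDestroy()
    {
        foreach (Pending p in pending)
        {
            p.handle.Complete();
            p.noise.Dispose();
        }
        pending.Clear();
    }

    void BuildChunk(Vector2Int offset, NativeArray<float> noise)
    {
        // Placeholder: a real version would build a mesh or tiles from the noise data.
        var chunkObject = new GameObject($"Chunk {offset.x},{offset.y}");
        chunkObject.transform.SetParent(transform);
    }
}
```

With this shape the 13th chunk no longer holds back the other 12. The cost is one schedule call and one allocation per chunk, which may or may not matter at the chunk counts involved; pooling the buffers would be an obvious follow-up if it does.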
Obviously I can also use plain C# threads. I’m more familiar with the workflow and I think it would boost performance significantly. Most of the issues I mentioned here can be solved fairly trivially with some clever programming. But partly as a learning challenge, and partly because this is the Unity way of doing things and therefore likely the most performant, I’d really like to use the job system. I would think what I’m trying to achieve is the kind of thing the system is designed for, so I assume there are many things I’m unaware of…