I have two systems in my renderer. The Culling System handles AABB testing, sorts the visible entities, and collects their matrices. The Rendering System slices the matrices into batches of 1023 and sends them to Graphics.DrawMeshInstanced (so mostly main thread, non-Burst).
In this case I have to call Complete on the CullingSystem’s JobHandle in the RenderingSystem. The first thing I’m considering is putting their updates as far apart as possible, to give the worker threads time to handle culling.
Another thing I’m considering is that I could get away with out-of-date data for the Renderer. For example, at frame 2 the Renderer could use the culling result of frame 1, while the Culling System handles the culling of frame 2 without interruption.
Is this a pattern you’ve used before? Is this a good idea?
It can work pretty well if your camera has capped linear and angular velocity, never makes jump cuts, and you have a capped DeltaTime. Otherwise you will get artifacts.
Managing job scheduling requires knowledge of your whole project and good use of the timeline profiler to see where the bottlenecks are. Therefore, it is not something you can easily reason about until the project has most of the desired functionality. Heavily jobified projects and sparsely jobified projects often require different actions to optimize scheduling. I do know some general techniques for architecting execution order, but those assume certain characteristics of the simulation step which are often not present.
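To make the one-frame-latency idea concrete, here is a minimal sketch. It assumes the culling job appends visible matrices to a NativeList<Matrix4x4> (Collections package) and that a single mesh and material are drawn. LatentCullingRenderer and the scheduleCulling callback are hypothetical names for illustration, not anything from your project.

```csharp
using System;
using Unity.Collections;
using Unity.Jobs;
using UnityEngine;

// Hypothetical helper: draws the culling result of the previous frame
// while the culling job of the current frame runs on worker threads.
public sealed class LatentCullingRenderer : IDisposable
{
    const int MaxBatch = 1023; // per-call instance limit of Graphics.DrawMeshInstanced

    readonly NativeList<Matrix4x4>[] _visible = new NativeList<Matrix4x4>[2];
    readonly Matrix4x4[] _batch = new Matrix4x4[MaxBatch];
    JobHandle _cullingHandle;
    int _writeIndex;

    public LatentCullingRenderer()
    {
        _visible[0] = new NativeList<Matrix4x4>(Allocator.Persistent);
        _visible[1] = new NativeList<Matrix4x4>(Allocator.Persistent);
    }

    // scheduleCulling (assumed) schedules a job that fills the given list
    // and returns its handle.
    public void Update(Func<NativeList<Matrix4x4>, JobHandle> scheduleCulling,
                       Mesh mesh, Material material)
    {
        // This job was scheduled a full frame ago, so Complete should rarely block.
        _cullingHandle.Complete();

        // The list written during the previous frame is now safe to read.
        NativeList<Matrix4x4> ready = _visible[_writeIndex];

        // Swap buffers and start this frame's culling into the other list.
        _writeIndex = 1 - _writeIndex;
        NativeList<Matrix4x4> target = _visible[_writeIndex];
        target.Clear();
        _cullingHandle = scheduleCulling(target);
        JobHandle.ScheduleBatchedJobs(); // let worker threads start right away

        // Draw the stale result in slices of at most 1023 instances.
        NativeArray<Matrix4x4> matrices = ready.AsArray();
        for (int start = 0; start < matrices.Length; start += MaxBatch)
        {
            int count = Mathf.Min(MaxBatch, matrices.Length - start);
            NativeArray<Matrix4x4>.Copy(matrices, start, _batch, 0, count);
            Graphics.DrawMeshInstanced(mesh, 0, material, _batch, count);
        }
    }

    public void Dispose()
    {
        _cullingHandle.Complete();
        _visible[0].Dispose();
        _visible[1].Dispose();
    }
}
```

Nothing is drawn on the first frame because the first list is still empty. After a jump cut or a DeltaTime spike, the previous frame's visibility set no longer matches the camera, which is where the artifacts come from.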
I would need to know more about your project or at the very least see a timeline capture to provide more insight.
It mostly bothers me that I have to call Complete on the CullingJob, since the Renderer is non-Bursted (because of the DrawMeshInstanced usage). If I could make the rendering task a job, the CullingJob would just be added as a dependency of the Renderer. It feels like I’m losing the multithreading advantage, since the CullingJob will be completed on the main thread. But it’s true that I need to profile this; I’m just designing the system at the moment.
You won’t be able to escape calling Complete() somewhere in your render logic. But you can move a lot more of your rendering logic to jobs. In order of increasing performance:
1. Use GC Handles to forward managed arrays to non-Bursted jobs and use NativeArray.CopyTo to perform a fast copy inside the jobs (which will be very close to Burst performance).
2. Use DrawMeshInstancedProcedural and use jobs to populate a ComputeBuffer. Bonus if you use the direct buffer mapping APIs (you have to use async readbacks to figure out which buffers from previous frames are still in use).
3. Use BatchRendererGroup or the Hybrid Renderer, which are capable of retaining most of the render state between frames on the GPU.
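As a rough illustration of the GCHandle option, here is a minimal sketch. It assumes the culling job writes matrices into a fixed-capacity NativeArray<Matrix4x4> and stores the visible count in element 0 of a one-element NativeArray<int>. FillBatchesJob and BatchedDrawer are hypothetical names. It uses the static NativeArray<T>.Copy overload rather than CopyTo, because CopyTo requires source and destination to have the same length. Treat it as a starting point to profile, not a reference implementation.

```csharp
using System;
using System.Runtime.InteropServices;
using Unity.Collections;
using Unity.Jobs;
using UnityEngine;

// Deliberately not Burst-compiled: it reaches managed arrays through a GCHandle.
struct FillBatchesJob : IJob
{
    public const int MaxBatch = 1023;

    [ReadOnly] public NativeArray<Matrix4x4> Visible;  // assumed culling output
    [ReadOnly] public NativeArray<int> VisibleCount;   // assumed: [0] = valid matrix count
    public NativeArray<int> BatchCounts;               // out: instances per batch
    public GCHandle Batches;                           // handle to a Matrix4x4[][]

    public void Execute()
    {
        var batches = (Matrix4x4[][])Batches.Target;
        int total = VisibleCount[0];
        // Instances beyond batches.Length * MaxBatch are dropped in this sketch.
        for (int i = 0; i < batches.Length; i++)
        {
            int start = i * MaxBatch;
            int count = Math.Max(0, Math.Min(MaxBatch, total - start));
            if (count > 0)
                NativeArray<Matrix4x4>.Copy(Visible, start, batches[i], 0, count);
            BatchCounts[i] = count;
        }
    }
}

sealed class BatchedDrawer : IDisposable
{
    readonly Matrix4x4[][] _batches;
    GCHandle _handle;
    NativeArray<int> _counts;

    public BatchedDrawer(int maxBatches)
    {
        _batches = new Matrix4x4[maxBatches][];
        for (int i = 0; i < maxBatches; i++)
            _batches[i] = new Matrix4x4[FillBatchesJob.MaxBatch];
        _handle = GCHandle.Alloc(_batches); // keeps the arrays reachable from the job
        _counts = new NativeArray<int>(maxBatches, Allocator.Persistent);
    }

    // Chains the copy after culling, so no Complete is needed at this point.
    public JobHandle Schedule(NativeArray<Matrix4x4> visible,
                              NativeArray<int> visibleCount, JobHandle culling)
    {
        var job = new FillBatchesJob
        {
            Visible = visible,
            VisibleCount = visibleCount,
            BatchCounts = _counts,
            Batches = _handle
        };
        return job.Schedule(culling);
    }

    // The only sync point: wait for the copy, then issue the draw calls.
    public void Draw(JobHandle fill, Mesh mesh, Material material)
    {
        fill.Complete();
        for (int i = 0; i < _batches.Length; i++)
        {
            int count = _counts[i];
            if (count == 0) break; // batches are filled in order
            Graphics.DrawMeshInstanced(mesh, 0, material, _batches[i], count);
        }
    }

    // Call only after every scheduled FillBatchesJob has completed.
    public void Dispose()
    {
        _handle.Free();
        _counts.Dispose();
    }
}
```

With this arrangement the main thread still calls Complete once, but only to wait for the copy. The slicing work itself runs on a worker thread as a dependency of the culling job.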