I refactored a system that scheduled 2 bigger jobs so that it now schedules 3 smaller jobs. I thought I would gain a little more parallelism, because I was then able to specify more precisely which components each job needs to read and write.
Before refactoring:
Both jobs had the same Query:
EntityQuery = ComponentA(ReadWrite), ComponentB(ReadWrite), ComponentC(ReadOnly), ComponentD(ReadOnly)
The first Job read from C and D and wrote to A
The second Job read and wrote to both A and B
It's important to note that this is a generic system and there are around 20 instances of it running. Each schedules those jobs via ScheduleParallel. I know that A is unique to each system, B is the same in every system, and C and D may be shared between systems.
After refactoring:
3 different queries, one for each job:
firstQuery = ComponentA(ReadWrite), ComponentC(ReadOnly), ComponentD(ReadOnly)
secondQuery = ComponentA(ReadWrite)
thirdQuery = ComponentA(ReadOnly), ComponentB(ReadWrite)
First Job: gathers data from C and D and writes to A
Second Job: transforms some data in A
Third Job: reads the processed data from A and writes the result to B
Now, instead of getting more tightly packed jobs from all those generic systems, they are spaced out a lot more…
Did I accidentally limit parallel access to a component between those instances of my generic system? If not, what else could be the reason I see less parallelism?
This is just something I'm trying to understand; it doesn't actually matter for my progress right now. As it turns out, it was way more performant on the main thread to only have 2 jobs and 1 query anyway:
2.3 ms vs. 1.5 ms for all instances of that generic system.
Generally speaking, is it a good idea to schedule bigger jobs instead of multiple smaller ones? At least when the jobs use nearly the same data, just with different write access, it seems like it.
Unless I misunderstood something, I think the confusion is that dependencies are per system, not per job. ECS gathers all of a system's jobs into a single dependency, which in your case has to be completed before other systems' jobs can run.
So dividing it up into more systems would make it more parallel, but yeah, fewer systems are in general easier on the main thread.
Thank you! This is exactly what's happening. I'm really sorry for this confusing question.
Right now every system runs all 3 jobs before the next system's set of jobs runs.
What I'd like to achieve is that Job1 and Job2 of all the systems could mix, and only the Job3s wait for each other.
But as you said it is not really a viable option to divide it into more Systems.
Would it be possible to somehow handle the dependencies of Job3 manually to achieve this?
Job3 needs to run after Job2 of its respective system.
No Job1 or Job2 of any system depends on any Job3.
The current implementation of inter-system dependency is fairly simple. Between systems, Dependency.Complete() is called.
This results in a sync point in the transition between systems regardless of any conflicting job queries. This is done because, unless you specify the order in which systems update, they may not be run in the same order in the build version as in the editor.
However, you can start interweaving dependencies between different systems by manually controlling the update order and storing the dependencies in an IComponentData (JobHandle is a struct which consists of an IntPtr and an int) that you then can use to schedule jobs instead of using the default Dependency.
It is annoying, and the JobsDebugger will complain, so you'll need to turn it off (Jobs → Toggle JobsDebugger). All it does is make sure all jobs are completed within the system they are initialized in, and moving the JobHandle into an IComponentData invalidates that check.
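A minimal sketch of that pattern might look like the following. All the names here (DependencyHolder, ProducerSystem, ConsumerSystem, and the component fields) are my own illustration, not an official API, and it assumes the JobsDebugger is off and that ProducerSystem is explicitly ordered before ConsumerSystem:

using Unity.Entities;
using Unity.Jobs;

public struct ComponentA : IComponentData { public float Value; }
public struct ComponentB : IComponentData { public float Value; }

// JobHandle is a plain struct (an IntPtr and an int), so it can be
// stored in an IComponentData and read back by a later system.
public struct DependencyHolder : IComponentData
{
    public JobHandle Producer;
}

public class ProducerSystem : SystemBase
{
    protected override void OnCreate()
    {
        EntityManager.CreateEntity(typeof(DependencyHolder));
    }

    protected override void OnUpdate()
    {
        // Schedule against this system's Dependency as usual...
        var handle = Entities
            .ForEach((ref ComponentA a) => { a.Value += 1f; })
            .ScheduleParallel(Dependency);

        // ...but also publish the handle for a specific later system,
        // instead of relying only on the default per-system ordering.
        SetSingleton(new DependencyHolder { Producer = handle });
        Dependency = handle;
    }
}

public class ConsumerSystem : SystemBase
{
    protected override void OnUpdate()
    {
        var producer = GetSingleton<DependencyHolder>().Producer;

        // Wait only on the jobs this system actually depends on.
        Dependency = Entities
            .ForEach((ref ComponentB b, in ComponentA a) => { b.Value = a.Value; })
            .ScheduleParallel(JobHandle.CombineDependencies(Dependency, producer));
    }
}

The design point is that ConsumerSystem combines the published handle with its own Dependency, so unrelated jobs from the systems in between can interleave freely.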
This isn’t true. This would imply there’s a sync point between every system. That would be awful for performance.
Really what happens is that there are read and write job handles for every component type, and after every OnUpdate(), Dependency gets used to update all those handles.
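Roughly, that bookkeeping can be pictured like this. This is a conceptual model in plain C#, not the actual Unity source; the type and method names are mine:

using System;
using System.Collections.Generic;
using Unity.Jobs;

// Conceptual model: per component type, track the last writing job
// and the jobs currently reading it.
class TypeHandles
{
    public JobHandle LastWriter;
    public List<JobHandle> Readers = new List<JobHandle>();
}

class ComponentDependencyModel
{
    readonly Dictionary<Type, TypeHandles> handles = new Dictionary<Type, TypeHandles>();

    // A new job must wait on the last writer of everything it touches,
    // plus all current readers of anything it writes.
    public JobHandle GetDependency(Type[] reads, Type[] writes)
    {
        var combined = default(JobHandle);
        foreach (var t in reads)
            if (handles.TryGetValue(t, out var h))
                combined = JobHandle.CombineDependencies(combined, h.LastWriter);
        foreach (var t in writes)
            if (handles.TryGetValue(t, out var h))
            {
                combined = JobHandle.CombineDependencies(combined, h.LastWriter);
                foreach (var r in h.Readers)
                    combined = JobHandle.CombineDependencies(combined, r);
            }
        return combined;
    }

    // After OnUpdate(), the system's final Dependency is folded back in
    // as the new reader/writer handle for every type it touched.
    public void AddDependency(JobHandle dep, Type[] reads, Type[] writes)
    {
        foreach (var t in reads)
        {
            if (!handles.TryGetValue(t, out var h)) handles[t] = h = new TypeHandles();
            h.Readers.Add(dep);
        }
        foreach (var t in writes)
        {
            if (!handles.TryGetValue(t, out var h)) handles[t] = h = new TypeHandles();
            h.LastWriter = dep;
            h.Readers.Clear();
        }
    }
}

The point of the model: dependencies are derived per component type, not per system, so there is no blanket sync between every pair of systems.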
I completely agree, but from a lot of personal testing, there is a sync point. Consider these profiler screenshots:
This is using the default system dependency handling.
And this uses manual dependency passing between the two systems, since I know Job1 and Job2 do not share dependencies.
Of course, there are sections where I must call Dependency.Complete() due to some data sorting requirements but there is a clear delineation between systems resulting in jobs not properly overlapping if there are no conflicting dependencies.
Spending more time, I could reduce the clear system boundaries and Dependency.Complete calls to just 2 instances.
Of course, I am switching around ScheduleParallel() and ScheduleSingle() instances for best performance with all of these jobs but otherwise the code is identical.
Edit: Also, these jobs are IJobEntityBatch structs (I use a lot of ChunkComponents and SharedComponents). The lambda codegen may have additional features that manage dependencies properly. I have not tested those yet so you may be correct.
“Sync points” specifically refer to when your job threads synchronize with the main thread. Since none of your screenshots show the main thread, I can’t tell if anything is a sync point or not.
What I do know is that Job 2 has an unnecessary dependency on Job 1, because Job 1 is in the same system as some Job 0 on which Job 2 does have a real dependency. The solution is to either split your systems, or sometimes these gaps just get filled in as you add more systems that operate independently of each other.
Ah, I must be using the wrong terminology. Apologies, self taught ECS coder here.
Huh. So you’re saying that I should atomize my job scheduling by making (ideally) every job a different system, and the automatic scheduler would properly fill in the empty time between jobs?
I have a well established file and system layout so it’s too late for me in my current project but something to keep in mind. I’ll keep using my current Dependency singleton passing to interweave jobs but maybe next time I can get Unity to do it automatically.
Correct. This is why we only care about main thread synchronization. The more jobs and systems you add, the more efficient the job scheduler can be, but main thread sync points hurt that a lot.
Pretty easy to show that different jobs from different systems can run at the same time if they have no dependencies in their chain even if they’re working on the same entities.
Source
public class TestSystem1 : SystemBase
{
    protected override void OnCreate()
    {
        this.EntityManager.CreateEntity(typeof(Rotation), typeof(Translation));
        this.EntityManager.CreateEntity(typeof(Rotation), typeof(Translation));
        this.EntityManager.CreateEntity(typeof(Rotation), typeof(Translation));
        this.EntityManager.CreateEntity(typeof(Rotation), typeof(Translation));
    }

    protected override void OnUpdate()
    {
        this.Entities.ForEach((ref Translation translation) =>
            {
                var result = translation.Value.x;

                // Just some busy work
                for (var i = 0; i < 1000; i++)
                {
                    result = noise.snoise(new float2(result, i));
                    result = noise.snoise(new float2(result, i));
                    result = noise.snoise(new float2(result, i));
                    result = noise.snoise(new float2(result, i));
                }

                var output = translation.Value;
                output.x = result;
                translation.Value = output;
            })
            .ScheduleParallel();
    }
}
public class TestSystem2 : SystemBase
{
    protected override void OnUpdate()
    {
        this.Entities.ForEach((ref Rotation rotation) =>
            {
                var result = rotation.Value.value.x;

                // Just some busy work
                for (var i = 0; i < 1000; i++)
                {
                    result = noise.snoise(new float2(result, i));
                    result = noise.snoise(new float2(result, i));
                    result = noise.snoise(new float2(result, i));
                    result = noise.snoise(new float2(result, i));
                }

                var output = new quaternion(rotation.Value.value);
                output.value.x = result;
                output = math.normalize(output);
                rotation.Value = output;
            })
            .ScheduleParallel();
    }
}
Thank you guys. That discussion is exactly what I needed!
The TL;DR of this thread for me is that I generally should not care about holes in between jobs, and should prefer scheduling fewer (albeit bigger) jobs to optimize main-thread time.
Dependency.Complete() is called before OnUpdate(), but only to make sure the jobs from the last frame are complete.
From the SystemState.cs source code (I think m_JobHandle is the same as the Dependency property):
internal void BeforeOnUpdate()
{
    BeforeUpdateVersioning();

    // We need to wait on all previous frame dependencies, otherwise it is possible that we create
    // infinitely long dependency chains without anyone ever waiting on it
    m_JobHandle.Complete();
    NeedToGetDependencyFromSafetyManager = true;
}