Massive performance drop with a large number of collisions in a barebones Unity Physics demo

Background
So I am evaluating what order of magnitude of objects I can expect my game to handle, and I have set up a barebones physics demo to check out a large number of simultaneous collisions.

Set up

  • a floor which is a large plane with a static Physics Body, and a Plane Physics Shape.
  • a single system which at start spawns 20,000 cubes from a prefab which has a dynamic Physics Body, and a Sphere Physics Shape.
  • the cubes are spawned at random locations within a set volume so that they all have some air time before they reach the floor; the spawning volume can be considered fairly packed.

I also have various settings to optimize performance:

  • a Physics Step with Unity Physics, 1 iteration count, and Multi Threaded on. (I have tried Havok as well with little to no difference.)
  • Jobs with Use Job threads, no Jobs Debugger, and leak detection off.
  • Burst compilation with Synchronous Compilation and Safety checks off.
  • disabled all shadows.

Results
The spawning, as well as its packed nature, results in a high frame time initially; however, after settling down and during free fall I get a CPU frame time of 20 – 40 ms.

This looks something like this:
[Image: screenshot of the scene during free fall, 2021-04-24 14.04.27]

And profiling it yields this hierarchy:
[Image: screenshot of the profiler hierarchy during free fall, 2021-04-24 14.02.45]

Then, after the majority of the cubes have reached the floor and collide very frequently with it and/or with other cubes, I get a CPU frame time of 400 – 500 ms.

This looks like so:
[Image: screenshot of the scene after the cubes have landed, 2021-04-24 14.04.47]

and produces this profile hierarchy:
[Image: screenshot of the profiler hierarchy after the cubes have landed, 2021-04-24 14.03.03]

In other words, my bottleneck seems to be in WaitForJobGroupID, the lion's share of which consists of:

BroadPhase:DynamicVSDynamicFindOverlappingPairsJob (Burst)
NarrowPhase:ParallelCreateContactsJob (Burst)
Solver:ParallelBuildJacobiansJob (Burst)
Semaphore.WaitForSignal
Solver:ParallelSolverJob (Burst)

Question
My question is simply: is this result to be expected? And/or are there any more optimizations I can try?

Thank you!

As of Unity Physics 0.6.0 the physics systems run in the FixedStepSimulationSystemGroup, which looks at the deltaTime per frame and steps the physics deltaTime/fixedTime times. The fixedTime defaults to 1/60 s (60 Hz). As the simulation cost rises, the deltaTime per frame rises, which means the fixed-step group steps physics more times per frame, which raises the cost of the frame further, i.e. a vicious cycle.
Details on the Fixed Step approach are here: https://discussions.unity.com/t/793091

World.GetExistingSystem<FixedStepSimulationSystemGroup>().FixedRateManager = null; will turn off fixed-step and make the fixed-step sim group behave just like any other vanilla system group.

I did a quick check on setting the FixedRateManager to null. This makes the physics systems step with a pretty large timestep, which introduces a lot of motion per step, leading to a large number of collision pairs and an expensive simulation.
I found that setting the World's MaximumDeltaTime to the fixed group's Timestep (by default 1/60 s, i.e. 60 Hz) worked for me.

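using Unity.Entities;

// Runs first in the fixed-step group and configures it once.
[UpdateInGroup(typeof(FixedStepSimulationSystemGroup), OrderFirst = true)]
public class AaFixedSystemGroupUpdateSystem : SystemBase
{
    protected override void OnStartRunning()
    {
        var group = World.GetOrCreateSystem<FixedStepSimulationSystemGroup>();
        // Clamp the frame delta so the group never has to catch up by more than one step.
        World.MaximumDeltaTime = group.Timestep;
        //group.FixedRateManager = null;
    }

    // SystemBase declares OnUpdate as abstract, so an (empty) override is required.
    protected override void OnUpdate() { }
}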

This keeps the simulation stepping at 60 Hz but stops the system group from stepping the simulation multiple times per frame.
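
The feedback loop described earlier in the thread, and the effect of clamping the frame delta, can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions, not Unity's actual implementation: the class and parameter names are hypothetical, the per-step cost (25 ms), the base frame cost (2 ms) and the unclamped maximum delta (1/3 s) are made-up illustrative values, and the catch-up loop is a simplified guess at how a fixed-rate group decides how many steps it owes.

```csharp
using System;

public static class FixedStepSpiralSketch
{
    // Returns how many physics steps the model runs in each simulated frame.
    // All costs and times are in seconds; all values are illustrative assumptions.
    public static int[] Simulate(double stepCost, double baseFrameCost,
                                 double fixedTimestep, double maxDeltaTime, int frames)
    {
        var stepsPerFrame = new int[frames];
        double elapsed = 0.0;             // world time: sum of clamped frame deltas
        double simulatedUpTo = 0.0;       // time the fixed-step group has caught up to
        double frameTime = fixedTimestep; // assume the first frame took one timestep

        for (int i = 0; i < frames; i++)
        {
            // The world clock advances by the frame time, clamped to the maximum.
            elapsed += Math.Min(frameTime, maxDeltaTime);

            // Catch up: run one fixed step per whole timestep of world time owed.
            int steps = 0;
            while (simulatedUpTo + fixedTimestep <= elapsed + 1e-9)
            {
                simulatedUpTo += fixedTimestep;
                steps++;
            }
            stepsPerFrame[i] = steps;

            // Every step costs CPU time, so more steps make the next frame longer.
            frameTime = baseFrameCost + steps * stepCost;
        }
        return stepsPerFrame;
    }

    public static void Main()
    {
        const double h = 1.0 / 60.0;
        // Large maximum delta: the group keeps trying to catch up.
        int[] unclamped = Simulate(0.025, 0.002, h, 1.0 / 3.0, 40);
        // Maximum delta equal to the timestep: at most one step is ever owed.
        int[] clamped = Simulate(0.025, 0.002, h, h, 40);
        Console.WriteLine("unclamped: " + string.Join(" ", unclamped));
        Console.WriteLine("clamped:   " + string.Join(" ", clamped));
    }
}
```

In this model, once a single step costs more than the timestep, the unclamped run owes more steps every frame until the maximum delta caps it, while the clamped run stays at one step per frame at the price of the simulation running slower than real time.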

Thank you for this, your explanation and your succinct example! I also did a small test setting the FixedRateManager to null, but for me it seemed like all physics just stopped (which I guess should make sense).

I had better luck experimenting with the World.MaximumDeltaTime and group.Timestep (keeping your notation) – especially sacrificing time resolution by reducing the group.Timestep helped vastly.
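
A sketch of what such an experiment might look like, reusing the pattern from the example earlier in the thread. The system name and the 30 Hz value are hypothetical, and it assumes that "sacrificing time resolution" means stepping physics at a lower rate than the default 60 Hz; the exact values used in the test above are not stated.

```csharp
using Unity.Entities;

// Hypothetical system name; runs first in the fixed-step group and configures it once.
[UpdateInGroup(typeof(FixedStepSimulationSystemGroup), OrderFirst = true)]
public class CoarseFixedStepSetupSystem : SystemBase
{
    // Illustrative value (an assumption): 30 Hz instead of the default 60 Hz.
    const float CoarseTimestep = 1f / 30f;

    protected override void OnStartRunning()
    {
        var group = World.GetOrCreateSystem<FixedStepSimulationSystemGroup>();
        // Fewer, larger physics steps per second of game time.
        group.Timestep = CoarseTimestep;
        // Keep the frame delta clamped so at most one step is owed per frame.
        World.MaximumDeltaTime = CoarseTimestep;
    }

    // SystemBase declares OnUpdate as abstract, so an empty override is required.
    protected override void OnUpdate() { }
}
```

A larger timestep halves the number of steps per simulated second but also lets bodies travel further per step, so stability and tunnelling behaviour should be checked in the actual scene.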

Though steeveHavoks’s suggestion was already more than I could’ve hoped for, I’d of course be very interested if there are other ways to further increase performance.

Worked like a charm for me, with no additional tweaks.