[Broadphase flaw] Does Physics.OverlapSphereNonAlloc ignore LayerMask filtering internally?

I’ve encountered a strange issue (Unity 2019.3.5f1).

Physics.OverlapSphereNonAlloc performance is directly affected by the number of colliders on other layers, even if the LayerMask doesn’t contain them.

E.g. if there are a thousand colliders in the scene but the tested mask does not contain their layers, they still affect performance.

(1000+ on other layers, <10 on a test layer ~ 14ms)

This is on i5-8400:

Can anyone explain why colliders on a different layer affect OverlapSphereNonAlloc performance when the mask doesn’t contain that layer?

I’ve seen similar performance degradation in other projects too.

The mobile one I’m working on has thousands of colliders on one layer, and performing OverlapSphere on a different layer was causing a major spike. In the end I was forced to write a custom overlap detection system to replace OverlapSphere completely.

Here’s a method signature, just in case:

public static int OverlapSphereNonAlloc(Vector3 position, float radius, Collider[] results, int layerMask)

Does the physics engine only filter the results, and not the input data?

Here’s a comparison of the sphere and overlap box setups on 2020.1.6f1:

Same collider quantities, same layers. It has to be a bug.

I’m not a 3D physics dev but a handful of things that I can think of are:

  • The LayerMask is not being set to a mask correctly.

  • You’re setting Transforms and have AutoSyncTransforms On (So sync costs per-query which can in theory scale by colliders changing position)

  • The cost is in querying the broadphase, which will be affected because it can’t filter by layer until it knows what is potentially in the results. Layer filtering only saves performing the sphere vs collider narrowphase check.
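To make the first bullet concrete, here is a minimal sketch of the difference between a layer index and a layer mask. `LayerMaskSketch` and its methods are hypothetical helpers written in plain C# for illustration; they are not part of Unity. In Unity itself, `LayerMask.GetMask` builds the same kind of bitmask from layer names.

```csharp
using System;

// Hypothetical helper for illustration; not part of the Unity API.
public static class LayerMaskSketch
{
    // Builds a bitmask from layer indices (0..31): bit i is set when layer i is included.
    public static int MaskFromLayers(params int[] layerIndices)
    {
        int mask = 0;
        foreach (int layer in layerIndices)
        {
            if (layer < 0 || layer > 31)
                throw new ArgumentOutOfRangeException(nameof(layerIndices));
            mask |= 1 << layer;
        }
        return mask;
    }

    // True when the given layer index is selected by the mask.
    public static bool Contains(int mask, int layer)
    {
        return (mask & (1 << layer)) != 0;
    }
}
```

The classic mistake is passing a raw layer index where a mask is expected: the value 8 used as a mask has only bit 3 set, so it selects layer 3 rather than layer 8.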

  1. It is set correctly;
  2. I do have it on.
  3. Why does the same behaviour not occur when OverlapBox is performed?

OverlapBox is way faster than a sphere but it makes no sense in that regard.
In this case it has even more calls, but performs like 57 (*2) times faster than OverlapSphere.

Checking spheres should never be more costly than boxes (AA moreso).
(Unless there’s some weird magic trick used by Unity / physx that simplifies box broadphase)

TL;DR: Why is the OverlapBox so much faster?

Okay, so this gave me some thoughts:

If this is true, I think I know what’s going on.
I’ve got a large OverlapSphere “trigger” that covers the whole map, however, it is only set to <Player, Enemy> LayerMask.

This means almost every collider in the 4000x4000 scene would get picked up by the broadphase first.
Then filtered away. Which means it’s more of a design flaw that people should be aware of.
(Or at least the manual should state this.)

There aren’t any OverlapBoxes of the same size, which is why it is blazing fast.

However, this doesn’t explain other cases where there’re thousands of colliders, and small OverlapSpheres (in other projects).
I guess it’s the broadphase traversal cost?

In any case, I’ve been replacing it with custom overlap testing for the specific “objects”, e.g. player and enemies, with a different setup + Burst. So that should reduce the cost of those checks to almost zero.

Thanks, that was really helpful.

But surely that’s like saying that you add stuff to a hash-map and it’s a design flaw when a search has to check multiple entries that share a hash (assuming a non-unique hash) before it narrows them down to one. The hash-map has massively reduced what it needs to spend time on. The goal of the broadphase is similar: it lets the query avoid the much more costly narrowphase, i.e. collider geometry intersection tests. The broadphase is used to find candidates, and part of discounting them is filtering by layer as well as by the region the query is interested in (AABB). That process is by its nature very fast, and I’ve not said it’s dominant in your times here, but it could be; just that if you’re interested in a single item in a single layer and your query covers thousands of potentials, then they have to be discounted quickly by checking their layer. The layer HAS to be checked somehow by the CPU, and this is where it’s done. Even if you roll your own, you will still have to ask everything in that region what layer it is in and discount it, and that is all the broadphase is doing.
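The candidate-then-filter idea described above can be modelled roughly as follows. This is only an illustrative sketch: `Aabb`, `Proxy` and `BroadphaseSketch` are hypothetical types, the linear scan stands in for whatever spatial structure a real broadphase uses, and nothing here claims to mirror PhysX internals. It only shows that every collider whose bounds overlap the query region has to be looked at before its layer can be rejected.

```csharp
using System;
using System.Collections.Generic;

// Illustrative model only: hypothetical types, not PhysX or Unity internals.
public struct Aabb
{
    public float MinX, MinY, MinZ, MaxX, MaxY, MaxZ;

    public Aabb(float minX, float minY, float minZ, float maxX, float maxY, float maxZ)
    {
        MinX = minX; MinY = minY; MinZ = minZ;
        MaxX = maxX; MaxY = maxY; MaxZ = maxZ;
    }

    // Two boxes overlap when their intervals overlap on every axis.
    public bool Overlaps(Aabb other)
    {
        return MinX <= other.MaxX && MaxX >= other.MinX
            && MinY <= other.MaxY && MaxY >= other.MinY
            && MinZ <= other.MaxZ && MaxZ >= other.MinZ;
    }
}

public struct Proxy
{
    public Aabb Bounds;
    public int Layer; // layer index, 0..31

    public Proxy(Aabb bounds, int layer) { Bounds = bounds; Layer = layer; }
}

public static class BroadphaseSketch
{
    // Returns how many proxies pass both the bounds test and the layer test
    // (these would go on to the narrowphase). candidatesVisited reports how many
    // proxies overlapped the query bounds and therefore needed a layer check.
    public static int Query(IReadOnlyList<Proxy> proxies, Aabb queryBounds,
                            int layerMask, out int candidatesVisited)
    {
        candidatesVisited = 0;
        int accepted = 0;
        for (int i = 0; i < proxies.Count; i++)
        {
            Proxy p = proxies[i];
            if (!p.Bounds.Overlaps(queryBounds)) continue;   // rejected by region
            candidatesVisited++;
            if ((layerMask & (1 << p.Layer)) == 0) continue; // rejected by layer
            accepted++;
        }
        return accepted;
    }
}
```

In this model a query that covers the whole map visits every collider in it, whatever the mask says; the mask only reduces how many candidates reach the geometry test.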

With all respect intended, the design flaw here is in your project doing a huge area query which can be expensive as there’s a lot of potentials.

The disparity between the sphere and box isn’t something I can explain but then again, I’m not a 3D physics dev. The narrowphase part of that i.e. the sphere or box to other collider is also confusing as a sphere to anything intersection test is much quicker (as it’s a simple radius check) compared to a box intersection check which is more complex. This means the cost is elsewhere for sure and given the same volume or at least the same AABB of those queries, the broadphase part will be the same. Very confusing indeed and I wish I knew what it was.

It’s okay, I get that I’m misusing physics heavily here. I’m not rolling my own physics solution, I’m just replacing those OverlapSphere calls with plain narrowphase tests, without a broadphase.

I have only one player, and the enemy count probably won’t go beyond 1-2k due to other reasons, so brute-forcing ~10 Sphere ↔ N Sphere checks via jobs is fast enough for me (<1 ms is fine). If those costs rise, I’ll just add some kind of broadphase on top.

TL;DR: It’s more of an “I don’t need a broadphase” case.
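For readers wondering what such a brute-force check looks like, here is a minimal sketch. It is not the poster's code: it uses plain C# with `System.Numerics.Vector3` instead of Unity jobs and Burst, and `BruteForceOverlap.OverlapSpheres` is a hypothetical name. It writes into a caller-supplied buffer in the same spirit as the NonAlloc API.

```csharp
using System;
using System.Numerics;

// Hypothetical helper for illustration; not the thread author's implementation.
public static class BruteForceOverlap
{
    // Tests one query sphere against every target sphere.
    // Writes the indices of overlapping targets into results and returns how many
    // were written (capped at results.Length, so nothing is allocated).
    public static int OverlapSpheres(Vector3 center, float radius,
                                     Vector3[] targetCenters, float[] targetRadii,
                                     int[] results)
    {
        int count = 0;
        for (int i = 0; i < targetCenters.Length && count < results.Length; i++)
        {
            // Spheres overlap when the distance between centers is at most the
            // sum of the radii; comparing squared values avoids a square root.
            float reach = radius + targetRadii[i];
            if (Vector3.DistanceSquared(center, targetCenters[i]) <= reach * reach)
                results[count++] = i;
        }
        return count;
    }
}
```

The cost is linear in the number of targets per query, which is why it only makes sense while the target set stays small.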

Also, I don’t think this PhysX behaviour is mentioned anywhere.
Like the manual states - just use LayerMask, and it should negate the cost.