Hi there,
Profiling overhead can come from a number of things, and you can reduce it in just as many ways to get a clearer picture. I’ll try to list them from what looks most likely in this scenario to least likely.
Deep Profiling costs
You are using Deep Profiling, which automatically adds ProfilerMarker.Begin/End calls to all scripting methods & properties, so every one of those calls adds profiling overhead. The overhead per sample is relatively low, but the higher the call count, the worse this obviously gets.
So the first thing you can do is turn off Deep Profiling to get a better understanding of the overall distribution of time across the frame and how much your number of cars contributes to that. There should only be a minor difference between a build with “Deep Profiling Support” enabled but Deep Profiling turned off while recording, and a build without Deep Profiling Support at all, but if you want even cleaner data, you could try the latter and compare the results to see how much of a difference it makes. Most likely, having the support built in and being able to turn it on on the fly is worth it for the workflow of: profile without it, spot problematic areas, then turn it on to dig deeper.
Manual instrumentation
Now, without Deep Profiling on, the high-level overview you’re getting will stop at the first level of scripting calls that are invoked by Unity’s script messaging system (e.g. your CarAIController.FixedUpdate() calls and similar). The only samples that will appear underneath those are the ones that are explicitly instrumented with ProfilerMarker.Begin/End/Auto, either by you within the C# scripts of your project, or by such instrumentation in our Scripting API layer or native code (e.g. GC.Alloc, GameObject.Instantiate…).
So if you want a bit more detail without incurring the full Deep Profiling overhead: add ProfilerMarkers in relevant places, e.g. in CarAIController.Drive()/GetPath()/CarOverturned(), and refine this further as needed. If you think a particular bit of code is likely to have a bigger impact on performance or should be monitored closely, adding markers there can be valuable, not just to you but also to users of your Asset Store package. You can also think about how your users will see this data, how to give them the information that is most relevant to them, and what hints you can provide as to what they could do to mitigate the costs.
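As a rough illustration, here is a minimal sketch of what that manual instrumentation could look like (the method names are just taken from your description, adjust to your actual code):

```csharp
using Unity.Profiling;
using UnityEngine;

public class CarAIController : MonoBehaviour
{
    // Create the markers once and reuse them; construction is the expensive part.
    static readonly ProfilerMarker s_DriveMarker = new ProfilerMarker("CarAIController.Drive");
    static readonly ProfilerMarker s_GetPathMarker = new ProfilerMarker("CarAIController.GetPath");

    void FixedUpdate()
    {
        using (s_DriveMarker.Auto()) // Begin/End pair handled by the using scope
        {
            Drive();
        }
    }

    void Drive()
    {
        using (s_GetPathMarker.Auto())
        {
            GetPath();
        }
        // ... driving logic ...
    }

    void GetPath() { /* ... pathfinding ... */ }
}
```

These markers compile down to no-ops in non-development builds and show up as child samples underneath your FixedUpdate sample when profiling a development player, without needing Deep Profiling.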
Data transfer cost
Beyond the pure sample instrumentation overhead, the higher number of samples also needs to be transferred from your phone to your desktop machine. Things to consider: are you connecting over Wi-Fi or ADB? Alternatively, you could use the Profiler APIs to record the data to a file on the device instead, and then pull that file to your desktop machine for analysis.
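A minimal sketch of recording to a file on the device (assuming a simple component you drop into a test scene; the file name is just an example):

```csharp
using System.IO;
using UnityEngine;
using UnityEngine.Profiling;

public class ProfileToFile : MonoBehaviour
{
    void OnEnable()
    {
        // Write the binary profiler stream to the device's persistent data path
        // instead of streaming it over the PlayerConnection.
        Profiler.logFile = Path.Combine(Application.persistentDataPath, "capture");
        Profiler.enableBinaryLog = true; // produces a .raw file the Profiler window can load
        Profiler.enabled = true;
    }

    void OnDisable()
    {
        // Stop recording and close the file.
        Profiler.enabled = false;
        Profiler.logFile = "";
    }
}
```

You can then pull the .raw file off the device (e.g. with adb pull) and load it in the Profiler window for analysis.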
Data gathered for Profiler Modules you’re not interested in
Every Profiler Module that is added in the Profiler window (or, when profiling to a file, every area that is not disabled via calls to Profiler.SetAreaEnabled()) incurs some overhead for gathering the data it needs to provide more detailed information for that feature/area. How much varies greatly between modules and projects; typically high costs can come from e.g. the GPU, UI, Audio, and sometimes the Memory modules. (The latter usually indicates you’re thrashing the memory, so it highlights a problem more than being one on its own.)
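For example (a minimal sketch; which areas you disable depends on your current focus), when recording from script you could turn off the areas you don’t care about like this:

```csharp
using UnityEngine;
using UnityEngine.Profiling;

public class TrimProfilerAreas : MonoBehaviour
{
    void Awake()
    {
        // Skip stats collection for areas we are not currently investigating.
        Profiler.SetAreaEnabled(ProfilerArea.GPU, false);
        Profiler.SetAreaEnabled(ProfilerArea.UI, false);
        Profiler.SetAreaEnabled(ProfilerArea.UIDetails, false);
        Profiler.SetAreaEnabled(ProfilerArea.Audio, false);
    }
}
```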
You can get some understanding of the cost of the modules by looking at the main thread’s base profiling sample (i.e. on the same level as PlayerLoop and EditorLoop), named either Profiler.CollectEditorStats or Profiler.CollectGlobalStats depending on what you are profiling. Underneath it, each module that could incur a higher cost for gathering its stats at the end of the frame will have a sample indicating how long that data gathering took. Cost incurred over the duration of the frame, e.g. due to GPU profiling, won’t be captured there, but it can give you an indication.
Anyway, long story short: remove modules for areas that are not your current focus to get cleaner data overall, then re-add them once you have a better idea of where to look for more details in those areas.
Call Stacks
If you are hunting down GC.Alloc samples and are using Deep Profiling to pinpoint them, you should probably turn on the Call Stacks toggle in the toolbar instead, as a lower-cost way of getting even more detail on where and why they occur.
Lastly
We are aware of these workflow pains and the costs that muddy the waters. We’re trying to reduce them wherever and however we can: offering ways like the ones described above to opt in and out of them as needed, making it easier to add custom data to the Profiler, improving the documentation to show best practices when profiling, and improving the stability, usability and performance of the PlayerConnection that is used for the data transfer. Nothing I wrote here should be read as an excuse to stop striving to make the profiling experience better and to address performance issues, especially in scenarios where these costs make profiling nearly impossible or even lead to crashes.
Speaking of crashes: if that happens, please report a bug so we can fix it. We’ve already put quite a bit of work into ensuring stability here, but we can always miss something. Also, if none of the ways I described help you work around these costs and unblock you, we’d want to know about that too, so we can use that feedback to figure out where else we need to reduce costs, offer opt-outs, or fix outright bugs in the profiling suite.