LateBehaviourUpdate costs 1.17ms, but the sum of all its children is 0.81ms. That’s a difference of 0.36ms and I would like to understand what that is.
What’s the 0.36ms? How can I find out?
Profiled IL2CPP build:
I am curious myself whether my assumptions hold true (I think someone told me this 25 years ago or so), because I have simply come to accept that these numbers never match up.
As far as I understand profilers, and I believe this is common to all of them, there’s always some unaccounted overhead, usually the method call itself. The parent’s time includes the call overhead of its children, but the profiler markers for a given method start and end inside that method, and thus do not include the call overhead (things like putting parameters on the stack, jumping to the method’s first instruction address, and loading the return value onto the stack).
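To make the idea concrete, here is a rough Python sketch (not Unity code and not how the Unity profiler is implemented). The names `work` and `measure` are made up for illustration. The "marker" inside `work` only sees the method body, while the caller's measurement also covers the call and return, so the outer time can never be smaller than the inner one.

```python
import time

def work():
    # Hypothetical "profiler marker": starts and ends inside the method,
    # so it cannot see the cost of calling into or returning from it.
    start = time.perf_counter()
    total = sum(range(10_000))
    end = time.perf_counter()
    return total, end - start

def measure():
    # The caller's measurement wraps the whole call, including call overhead.
    outer_start = time.perf_counter()
    _, inner = work()
    outer = time.perf_counter() - outer_start
    return inner, outer

inner, outer = measure()
overhead = outer - inner  # time the inner marker did not account for
print(f"inner={inner * 1000:.4f} ms, outer={outer * 1000:.4f} ms, "
      f"unaccounted={overhead * 1000:.4f} ms")
```

The exact numbers vary from run to run and machine to machine; the point is only that the difference is not negative and that it belongs to the parent, not to the child.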
My understanding is that the 1.17 ms includes both the measuring and method call overhead of everything going on “inside” that profiler marker. Then, all the numbers you see as “0.00” aren’t exactly existing outside of time and space, they’re likely just rounded down to 0.00.
There may be several 0.004999s shown as “0.00” that could amount to a good portion of the “missing” time, depending on how far down these zeros reach. On the screenshot alone it may amount to .05 ms but the 0s probably go further down quite a bit, so perhaps it could be .10 ms or more.
Yes, I believe this is the biggest contributor to the unknown here.
The “total time of a sample” == “sum of all child samples” + “self time”, so the visible time would be 0.81+0.22=1.03, and we have 1.17-1.03=0.14ms of unaccounted time.
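The same accounting as a small Python sketch, using the numbers from this thread. The function name `unaccounted_ms` is just an illustrative helper, not anything from the profiler API.

```python
def unaccounted_ms(total_ms, children_ms, self_ms):
    """Time left over once child samples and self time are subtracted."""
    return total_ms - (children_ms + self_ms)

# Values as read from the profiler in this thread (all in milliseconds).
gap = unaccounted_ms(total_ms=1.17, children_ms=0.81, self_ms=0.22)
print(f"unaccounted: {gap:.2f} ms")
```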
Given that we round all numbers to two decimal places (x.00), there could well be values below 0.005 that are displayed as 0.00, so the hidden accumulated time can be e.g. 23*0.0049 = 0.1127ms.
There is still a gap, which could be because there are more child samples or because of a floating point accuracy issue: we store sample times as floats.
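A quick way to see how much time can hide behind the zeros is to format the values the same way the display presumably does. This is only a sketch: the two-decimal formatting and the 23 samples of 0.0049 ms are taken from the example above, and `displayed` is a hypothetical stand-in for whatever the profiler UI really does.

```python
def displayed(ms):
    # Assumption: the UI shows times rounded to two decimal places.
    return f"{ms:.2f}"

samples = [0.0049] * 23  # 23 child samples, each just under 0.005 ms
hidden = sum(samples)    # what they actually add up to

print(all(displayed(s) == "0.00" for s in samples))  # every row shows 0.00
print(f"hidden total: {hidden:.4f} ms, displayed as {displayed(hidden)}")
```

So each row on its own looks free, but together they account for roughly 0.11 ms of the parent's time.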