The Transform class has a function, SetPositionAndRotation, which combines two assignments into one internal (C++) API call.
I would love to see this for local position and rotation as well!
You can implement your own extension:
public static class MyTransformExtensions {
    public static void SetLocalPositionAndRotation(this Transform xf, Vector3 localPosition, Quaternion localRotation) {
        xf.localPosition = localPosition;
        xf.localRotation = localRotation;
    }
}
That still makes two internal API calls to the C++ side of the engine.
It’s about performance, not code quality.
The performance difference here is going to be negligible.
I think the main point of orion’s post is to offer a workaround until (or unless) Unity adds the function.
I’m not saying I disagree with your original statement though, it makes sense to have one function for it all.
Look, the differences are literally negligible. However, if you go ahead and benchmark SetPositionAndRotation to prove me wrong, keep in mind that they are bundled together exactly because certain calculations can be optimized when transforming to world space, especially in large hierarchies.
In local space there are no calculations, just plain assignment.
I would claim the difference is negligible even in world space, but I am willing to admit there are some compounding issues in very hot paths, and transform APIs should definitely be designed for performance.
While it is negligible for a single use, if used in Update and/or for many objects it can definitely matter.
Judging by my benchmarks, the combined SetPositionAndRotation call is about twice as fast as the two separate assignments.
This is also something mentioned in the official performance guides from Unity, both in their performance e-book and Intellisense suggestions.
There, the performance hit is attributed to the communication between the C++ and C# parts of the transform component. I assume the same benefit could be seen with local position and rotation, as each of them also uses an internal call.
If this is because of code optimizations for world space coords, I would love to know the optimization being done for this and confirmation this is the actual reason for the performance benefit.
Benchmarks (profiler screenshots for 10k and 1m iterations):
Code used:
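using UnityEngine;
using UnityEngine.Profiling;

// Class name is illustrative; the original post showed only the field and Start().
public class TransformBenchmark : MonoBehaviour
{
    int count = 1000000;

    // Start is called before the first frame update
    void Start()
    {
        // Two separate property writes per iteration.
        Profiler.BeginSample("double");
        for (int i = 0; i < count; i++)
        {
            transform.position = Vector3.zero;
            transform.rotation = Quaternion.identity;
        }
        Profiler.EndSample();

        // One combined call per iteration.
        Profiler.BeginSample("combined");
        for (int i = 0; i < count; i++)
        {
            transform.SetPositionAndRotation(Vector3.zero, Quaternion.identity);
        }
        Profiler.EndSample();
    }
}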
This is backed up when I add local position and rotation to the benchmark: setting them separately takes as long as the double assignment for world coordinates.
As I said, world and local space do not relate to each other one to one. Setting world position and rotation basically necessitates the local matrices to be recalculated inversely, while setting the local ones simply assigns them.
But the next unavoidable step in the overhead comes from the need to compute the world position and rotation again and for all the affected children, because this is what is effectively displayed. That being said, there are countless possible optimizations to skip certain steps or to make things more cache-friendly, with less computation involved.
Namely setting child’s local space affects only its transform, but setting its parent’s local space affects both the parent and the child. This example is deliberately out of touch with the reality where the scenes are usually more complex, but feel free to extrapolate the core idea.
So the question is: are you sure your benchmarks properly isolate the individual steps, as well as the complexities arising from the system? I’m not sure it’s even possible, given that it’s all under the hood. In other words, what you see as a pure assignment is instead a lot of computation packed as tightly as possible, with non-trivial optimizations you cannot possibly fathom, and you would probably gain nothing from a compound method for assigning local values anyway.
The reality is that if you’re affecting the local space of a parent with a deep hierarchy (or lots of children), you’re practically invalidating that many transforms, whose local matrices are then repeatedly multiplied with their parents’ world matrices. Of course there are certain optimizations and caches in place, but there is a natural limit to that, as you still need to deterministically come up with exact transformations. Your benchmark does not show how complex your scene is, so it is likely misleading you into thinking it’s all about assignments.
It’s not. Assignments are extremely fast on today’s computers, even across the bridge between C# and C++; we live in 2022 after all. The bottlenecks are due to something else, likely matrix reintegration, and that is already as optimal as it can get under the hood. You can’t do much about it (edit: at least with this type of architecture) unless you start being more proactive about the complexity of your scene and the number of updates with which you stress your hierarchies, which is a different topic.
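To make the parent/child point concrete, here is a minimal sketch in plain C# that uses System.Numerics rather than Unity's own types. The Node class and HierarchyDemo are hypothetical stand-ins and say nothing about how Unity actually implements Transform. They only illustrate the general idea that a child's world matrix is its local matrix multiplied by its parent's world matrix, so writing to a parent's local values changes the world result of every descendant.

```csharp
using System;
using System.Numerics;

// Hypothetical stand-in for a transform node; NOT Unity's implementation.
public sealed class Node
{
    public Vector3 LocalPosition = Vector3.Zero;
    public Quaternion LocalRotation = Quaternion.Identity;
    public Node Parent; // null for a root node

    // Local matrix: rotate first, then translate (System.Numerics uses row vectors).
    public Matrix4x4 LocalMatrix =>
        Matrix4x4.CreateFromQuaternion(LocalRotation) *
        Matrix4x4.CreateTranslation(LocalPosition);

    // World matrix: recomputed on every read by walking up the parent chain.
    public Matrix4x4 WorldMatrix =>
        Parent == null ? LocalMatrix : LocalMatrix * Parent.WorldMatrix;

    public Vector3 WorldPosition => WorldMatrix.Translation;
}

public static class HierarchyDemo
{
    public static Vector3 Run()
    {
        var parent = new Node();
        var child = new Node { Parent = parent, LocalPosition = new Vector3(1f, 0f, 0f) };

        // Only the parent's local position is written; the child's local values are untouched.
        parent.LocalPosition = new Vector3(0f, 5f, 0f);

        return child.WorldPosition; // (1, 5, 0)
    }
}
```

The child's local position never changes here, yet its world position ends up at (1, 5, 0) because the parent moved. This sketch recomputes everything on each read; a real engine would presumably cache and invalidate these matrices instead, which is exactly the part you cannot see from C#.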
I think the calculations for the children happen on the C++ side? This is just a guess.
I ran the same benchmark again with 500 and 8000 child objects under the parent and still found the combined function to be about twice as fast.
This might be because, if you set the position first and the rotation second, all the child calculations are done twice (once for each assignment). Whatever the cause is, I think having the same function for local space might be beneficial.
I also used randomized values to rule out any caching in the benchmark and tried the benchmark script on child objects as well.
If you still disagree with me, please provide benchmarks to back up your statements.
This isn’t a contest and I actually never said that SetPositionAndRotation is in any way meaningless.
However, what neutral state are you using to back up your statements? It is twice as fast compared to what?
You are decidedly falling into a trap I’ve warned you about in my second reply.
edit:
If you still don’t understand me, here is the ultimate question: ‘why do you believe you would gain the same performance benefit if there was such a method for local space?’
And whatever your answer might be, it is merely a superstitious guess. My position is that you would gain almost nothing, because you are wrongly estimating the degree of optimizations used for the world space and world space only.
edit2:
In other words, you don’t see performance BOOST when you use this, instead you see how slow the thing is without it, which is exactly why the method exists. This problem likely DOES NOT EXIST in local space.
Stop believing that the assignments or calls themselves are slow, what you see is what you get. That’s your actual performance benefit. That’s the whole point I’m trying to make. There is no magical realm of ludicrous speed on the C++ side when it comes to assigning the numbers. It doesn’t go through hell and back. The assignment itself is likely measured in nanoseconds. TEST IT!
Setting position and rotation separately, whether local or world, takes twice as long as setting the world values in one call. I assume the same gain could be had by combining the local transform calls into one.
English is my second language, so maybe I missed something in your message, but I did not get much clarity from it as to why the change would be negligible.
Sorry, I don’t have any C++ experience. I mainly got this info from the Unity performance guides, which I took as truth.
I do not really care where exactly the performance win comes from, but I know that it could be a performance benefit to have the same functions for local and world coordinates.
I can see why you would believe it’s that simple. But don’t just assume it’s simple, test it.
Make a proper testing ground, try working only in local space, then in world space. Make a test where you just assign stuff to local space, then progressively add parents without changing anything else.
Do the same thing where you modify only the parent’s transform, but progressively include children.
Now offset the children so that they cannot rely on caching. Get a feel for what happens under the hood.
Also learn how matrices work, why they do what they do, and where the actual optimizations might lie.
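As a starting point for the first of those tests, here is a rough sketch, assuming a Unity MonoBehaviour placed in an otherwise empty scene. The class name, field names and counts are all made up for illustration, and timing with Stopwatch inside Start() is only one possible approach; the Profiler samples used earlier in the thread would work as well. It writes randomized local values to a leaf object, then adds one more offset parent above it each round and repeats.

```csharp
using System.Diagnostics;
using UnityEngine;

// Hypothetical test component; names and counts are illustrative only.
public class LocalAssignDepthTest : MonoBehaviour
{
    [SerializeField] private int maxDepth = 8;         // parents added, one per round
    [SerializeField] private int iterations = 100000;  // local writes per round

    private void Start()
    {
        // Pre-generate random values so the random calls are not part of the timing,
        // and so repeated identical writes cannot be short-circuited.
        var positions = new Vector3[iterations];
        var rotations = new Quaternion[iterations];
        for (int i = 0; i < iterations; i++)
        {
            positions[i] = Random.insideUnitSphere;
            rotations[i] = Random.rotation;
        }

        Transform leaf = new GameObject("Leaf").transform;
        Transform top = leaf; // current root of the chain

        for (int depth = 0; depth <= maxDepth; depth++)
        {
            Stopwatch sw = Stopwatch.StartNew();
            for (int i = 0; i < iterations; i++)
            {
                leaf.localPosition = positions[i];
                leaf.localRotation = rotations[i];
            }
            sw.Stop();
            UnityEngine.Debug.Log(
                $"parents={depth}: {sw.Elapsed.TotalMilliseconds:F2} ms for {iterations} local writes");

            // Add one more parent above the chain, offset so it is not an identity transform.
            Transform newTop = new GameObject("Parent" + depth).transform;
            newTop.localPosition = Random.insideUnitSphere;
            newTop.localRotation = Random.rotation;
            top.SetParent(newTop, false);
            top = newTop;
        }
    }
}
```

For the second test, the same loop could write to the topmost parent instead of the leaf while children are added each round. Run it in a build as well as in the editor, since editor overhead may skew the numbers.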
And you know what, maybe you’re right!
I’m not 100% sure as I haven’t seen the C++ side myself (it belongs to proprietary code).