We commonly instantiate up to 100 or so objects, set their transform localPosition, and then reparent them to a container object (with worldPositionStays = false). Prior to Unity 5.4, this never showed up in the profiler as an issue, but in 5.4 it commonly takes up to ~1ms per call, resulting in huge 100+ms spikes.
Hi, would you be so kind as to let us know if this has been scheduled for a particular beta release yet? I use a hierarchy-based node system and find the SetParent performance issues are making the software difficult to operate compared to a 5.3 version.
The changelog says SetParent performance was improved in b21 and that we should give feedback.
In our project, loading levels takes roughly twice as long in 5.4b21 as it did in 5.3.5 (30s → 60s), and according to the Profiler all of that extra time is spent in SetParent. During normal gameplay we get lots of lag spikes with SetParent taking about 15ms per call, though only on some calls.
Please note that beta 21 had some big speedups in regards to SetParent & Destroy in sub hierarchies.
Generally, 5.4 SetParent performance is now better than 5.3 SetParent performance in the most common cases. But there are ways to get worse performance, so let's detail how the code is different.
Unity now allocates a big block of memory that is as big as the entire hierarchy, containing transform rotation / scale / position / parent indices etc.
For best performance:
Append or destroy from the end of a hierarchy (depth first). It is always the fastest operation. When building large hierarchies, always append to the end. (The profiler shows you when you are not taking the fastest path.)
SetParent internally creates a big array of all transform nodes. We now automatically resize it to 2x its size when it runs out of capacity. If you know how many objects you are going to add to a hierarchy, you can set hierarchyCapacity beforehand; this usually gives a 20-30% speedup when building large hierarchies (> 5000 game objects).
In the tests we have run, where you correctly set hierarchyCapacity and append at the end of the hierarchy, Unity 5.4 SetParent performance is better than 5.3 SetParent performance.
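As a rough sketch of the advice above (not code from this thread), the following component reserves capacity up front and then only appends to the end of the hierarchy. The class name, the prefab field and the count are hypothetical, and it assumes each prefab instance consists of a single transform; hierarchyCapacity and SetParent are the Unity 5.4 APIs discussed here.

```csharp
using UnityEngine;

// Hypothetical example component; names and counts are illustrative only.
public class ContainerSpawner : MonoBehaviour
{
    public GameObject prefab;   // assumed to be assigned in the Inspector
    public int count = 10000;   // assumed number of children to create

    void Start()
    {
        // The container is created as a root object, so it owns its own hierarchy.
        Transform container = new GameObject("Container").transform;

        // Reserve room for the container itself plus all children,
        // assuming every prefab instance is a single transform.
        container.hierarchyCapacity = count + 1;

        for (int i = 0; i < count; i++)
        {
            GameObject child = Instantiate<GameObject>(prefab);
            // Parenting directly under the root container appends at the end
            // of its hierarchy; worldPositionStays is false as in the first post.
            child.transform.SetParent(container, false);
        }
    }
}
```

If the prefab has children of its own, the capacity estimate would need to account for every transform in each instance.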
What to avoid:
Don't destroy from the beginning of the transform hierarchy. The fastest operation is always to destroy the whole hierarchy at the root game object. If that's not possible, always try to destroy from the end, not the beginning. Otherwise Unity has to do a lot of copying to re-pack the data.
Don't SetParent to transforms at the beginning of the hierarchy. Try to build hierarchies in a way where SetParent is applied at the end or relatively close to the end. (The amount of data copied depends on how many transforms lie between the point you parent to and the end of the hierarchy.)
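To make the copying cost concrete, here is a toy model in plain C#. It is not Unity's actual implementation; it only assumes, as described above, that transforms are stored in one array in depth-first order, so inserting a child anywhere but the end shifts everything behind it. PackedHierarchy, AddChild and ElementsMoved are made-up names for this sketch.

```csharp
using System;
using System.Collections.Generic;

// Toy model only: NOT Unity's implementation, just a depth-first packed list.
public class PackedHierarchy
{
    private readonly List<int> ids = new List<int>();     // node ids in depth-first order
    private readonly List<int> depths = new List<int>();  // depth per node, parallel to ids
    private int nextId;

    // Total number of entries that had to shift to make room for inserts.
    public long ElementsMoved { get; private set; }

    public PackedHierarchy()
    {
        ids.Add(nextId++);   // the root gets id 0
        depths.Add(0);
    }

    // Adds a new last child under parentId and returns the new node's id.
    public int AddChild(int parentId)
    {
        int parentIndex = ids.IndexOf(parentId);
        if (parentIndex < 0) throw new ArgumentException("unknown parent");

        // The parent's subtree ends at the next node that is not deeper than the parent.
        int insertAt = parentIndex + 1;
        while (insertAt < ids.Count && depths[insertAt] > depths[parentIndex]) insertAt++;

        // Everything from insertAt to the end shifts by one slot.
        ElementsMoved += ids.Count - insertAt;

        int id = nextId++;
        ids.Insert(insertAt, id);
        depths.Insert(insertAt, depths[parentIndex] + 1);
        return id;
    }
}

public static class Demo
{
    public static void Main()
    {
        // Append-only: every child lands at the end, so nothing is moved.
        var appendOnly = new PackedHierarchy();
        for (int i = 0; i < 1000; i++) appendOnly.AddChild(0);
        Console.WriteLine(appendOnly.ElementsMoved);    // prints 0

        // Going back to an early node: each insert shifts the 1000 later siblings.
        var goingBack = new PackedHierarchy();
        int first = goingBack.AddChild(0);
        for (int i = 0; i < 1000; i++) goingBack.AddChild(0);
        for (int i = 0; i < 1000; i++) goingBack.AddChild(first);
        Console.WriteLine(goingBack.ElementsMoved);     // prints 1000000
    }
}
```

In this model the cost of each insert is the number of entries behind the insertion point, which matches the description above: appending is free, while parenting near the beginning pays for everything that follows.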
@Joachim_Ante_1 : I’ve uploaded the project as case #804765
Also thanks for the explanation for how it works now. Note that we haven’t done any optimizations yet to make the best use out of this new SetParent behaviour so there’s certainly room for improvement on our end - the main problem seems to be that we parented everything (~55k objects) to a root object to keep the hierarchy tidy, which was fine in 5.3.
I looked through your project folder. Some hierarchies are applied in a way that will lead to good performance:
landChunksParent = new GameObject("LandChunks");
landChunksParent.transform.parent = transform;
int chunkCountX = Mathf.CeilToInt((float)xSize / LandChunk.chunkSize);
int chunkCountZ = Mathf.CeilToInt((float)zSize / LandChunk.chunkSize);
landChunks = new LandChunk[chunkCountX, chunkCountZ];
int chunkSizeX = xSize / chunkCountX;
int chunkSizeZ = zSize / chunkCountZ;
for (int x = 0; x < chunkCountX; x++) {
    for (int z = 0; z < chunkCountZ; z++) {
        LandChunk landChunk = Instantiate<GameObject>(landChunkGO).GetComponent<LandChunk>();
        // Each chunk is parented to the container created just above
        landChunk.transform.parent = landChunksParent.transform;
    }
}
This results in good performance. transform here is a root game object, and you are adding additional transforms to the end of the hierarchy, which is faster in 5.4 than in 5.3.
In other places you place a game object named “unclassified” first, add some instances to it as children, then add a bunch more game objects to the root, and afterwards go back and add more to the “unclassified” game object. In the meantime you have added many more game objects at the end of the hierarchy. This results in bad performance because Unity now has to copy the transforms that were added to the end.
You can solve this in two ways:
Make the unclassified game objects root game objects.
Only add game objects at the end of the hierarchy. Basically, create a game object, parent it, add its children, and don’t go back to add more to it once you have parented objects elsewhere in the hierarchy.
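The second option could look roughly like the sketch below. It is not the change made to the submitted project; the class name, group names and counts are hypothetical, and it assumes the component sits on a root game object so that each newly parented group is the last subtree in the hierarchy.

```csharp
using UnityEngine;

// Hypothetical loader; group names and counts are made up for illustration.
// Assumed to be attached to a root game object.
public class LevelBuilder : MonoBehaviour
{
    void Start()
    {
        // Finish each group completely before starting the next one,
        // instead of coming back to "Unclassified" later.
        BuildGroup("Unclassified", 100);
        BuildGroup("Buildings", 100);
    }

    // Creates a container, parents it under this transform and adds all of
    // its children before anything else is added to the hierarchy.
    Transform BuildGroup(string groupName, int childCount)
    {
        Transform group = new GameObject(groupName).transform;
        group.SetParent(transform, false);

        for (int i = 0; i < childCount; i++)
        {
            Transform child = new GameObject(groupName + "_" + i).transform;
            // The group is still the last subtree, so this is an append at the end.
            child.SetParent(group, false);
        }
        return group;
    }
}
```

If objects for a group only become known later, collecting them first and parenting them in one pass per group keeps the same ordering.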
In this case you have 50000 game objects and you are calling SetParent 8000 times, so things add up to your load time being roughly 2x slower in total. SetParent time with my changes went from 7000ms to 80ms. Changing your game code took me roughly 5 minutes.
It was good to check out this project folder, and I will write a blog post with examples of the corner cases you can now run into when doing lots of parenting. But since this is a perfectly solvable problem and the new system gives huge speedups in other places, we see this as working as intended.
Yep, with the explanations given in this thread it makes perfect sense and, as you said, is trivial to fix.
Thanks for taking the time to make sure it is working as intended (on a Saturday!).
@Sebioff
I am very curious if you have been able to adapt your code to work optimally with the new SetParent behaviour and if you were able to get performance to be as good or better than 5.3?
@Joachim_Ante_1
We haven’t done the full switch to 5.4 yet, but yes, at least the immediately apparent issues can be easily fixed by removing any unnecessary parenting and keeping hierarchies as shallow as possible, which seems to result in about equal performance to 5.3. Not sure how easily that can be achieved in practice for other games, but at least for us it shouldn’t be a big problem.
I have to say though that I never noticed a performance issue with SetParent in 5.3 in the first place, so I was curious and tried the following code to get a direct comparison:
void OnGUI() {
    if (GUILayout.Button("Benchmark")) {
        // Case 1: every child is appended directly under the root
        Profiler.BeginSample("Ordered parenting");
        GameObject root = new GameObject("Root");
        for (int i = 0; i < 20000; i++) {
            new GameObject("Child").transform.parent = root.transform;
        }
        Profiler.EndSample();
        // Case 2: each new object is parented under a randomly chosen child of the root.
        // Note: the int overload of Random.Range excludes max, so the last child is never picked.
        Profiler.BeginSample("Random order parenting");
        for (int i = 0; i < 20000; i++) {
            new GameObject("Child").transform.parent = root.transform.GetChild(Random.Range(0, root.transform.childCount - 1));
        }
        Profiler.EndSample();
#if UNITY_5_4
        // Case 3: same as case 1, but with capacity reserved up front (5.4 only)
        Profiler.BeginSample("Ordered parenting + capacity");
        root = new GameObject("Root");
        root.transform.hierarchyCapacity = 20000;
        for (int i = 0; i < 20000; i++) {
            new GameObject("Child").transform.parent = root.transform;
        }
        Profiler.EndSample();
#endif
    }
}
On a standalone build running on a 2014 Mac Mini I get the following results:
5.3.5p2:
So unless I made a mistake in my test, or there are some other benefits to the hierarchy change, the 5.3 behaviour is better (no huge penalty for not appending to the end of the hierarchy, and even slightly faster overall).
(Slightly off-topic, but apart from that 5.4 is performing much better for our game. Rendering time in particular improved a lot, probably mostly due to instancing.)
I’d like to add my voice to this - I, too, never noticed any performance issues with SetParent until 5.4, where I suddenly get multi-millisecond spikes for operations which were previously insignificant (many of which are in UI, as @mdrotar mentions).
I’d much rather have SetParent perform pretty well in all cases, rather than amazingly in one specific instance and poorly otherwise. I worry that this might be a lab-conditions optimisation that causes huge problems in real-life projects.
Would it be an option for you to submit your project, or a stripped down version, that illustrates the issue and post the case # here? This would allow Joachim to take a look at this use-case and hopefully come up with a fix.