Quick question about baking optimization

Hi guys,

I don’t know if this is the proper place to ask this, so mods please move the thread if it isn’t and I apologize in advance for any inconvenience.

My question is as follows:

We are doing some Unity 5 bakes that require a high-quality end result. To do so, we cranked up the settings a bit (a rough editor-script sketch of how we apply these follows the list):

  • 10k final gather rays
  • 360 texel resolution
  • 2048 atlas resolution
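For reference, this is roughly how we drive those settings from an editor script. Just a sketch: I'm assuming the Unity 5 LightmapEditorSettings property names here, and as far as I can tell the final gather ray count isn't exposed to scripts, so we still set the 10k rays in the Lighting window.

```csharp
using UnityEditor;

public static class BakeSetup
{
    [MenuItem("Tools/Setup High Quality Bake")]
    static void Setup()
    {
        // Baked GI resolution in texels per unit (the 360 texel res above).
        LightmapEditorSettings.bakeResolution = 360f;

        // Lightmap atlas size: 2048 x 2048.
        LightmapEditorSettings.maxAtlasWidth = 2048;
        LightmapEditorSettings.maxAtlasHeight = 2048;

        // Final gather rays (10k) are set in the Lighting window,
        // as they don't seem to be exposed in this API.

        // Start the bake without blocking the editor.
        Lightmapping.BakeAsync();
    }
}
```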

The bake has been running for about 24 hours now, and it's going smoothly, but I've noticed that in terms of performance it is doing this to our bake machine:

  • Processors are at full load (100%), which was expected

  • RAM usage is at 31 GB (the machine has 32), very unexpected

  • I have two Unity Job processes running at 9.8 GB each (almost 20 GB total)

So the questions are:

  • Why is Unity consuming that much RAM? I mean, why does the bake engine want to use that much?
  • If we had 64 GB, would it use it all as well? And would it go faster if we gave it more?
  • What would happen to our bake if we only had, for instance, 8 GB? Would it just stall and crash, or would it just take longer to finish?
  • If we replaced the current processor (an Intel i7-4700MQ) with two Intel Xeon E5 processors (6 cores / 12 threads each, for a total of 12 cores and 24 threads), would baking be faster?

I’m asking because we are going to need to do some high quality baking in the future, so it would be helpful to know what to expect in terms of performance and hardware requirements.

Also, if someone could give me a basic overview of how the Unity 5 bake engine works in terms of performance and hardware usage, and of the impact of the settings on the end result, it would be superb!

Thanks in advance for the help :wink:

I’m eager to see a reply from UT to this question, especially on the RAM usage part.

x2, as I want to buy an i7-5960X or a Xeon and don’t know if it will work and whether the cost is justified. They said Unity 5’s life is expected to be around 1.8-2 years, and we’ve had it for more than 5 months now, if I’m not mistaken. That’s a big percentage of its lifetime, and I’m not planning on staying to the end to find out. It would be nice to know whether this is the Unity team’s problem or Geomerics’.

I have 64 GB and a 5960X, and on some scenes Unity will still swap to disk. We are considering a larger machine (128-172 GB of RAM) to see how it goes, but I haven’t found any guideline on RAM.

1) It is expected with the resolution you are requesting. Memory usage goes up when you increase the resolution because the geometry in your scene is subdivided.

2 and 3) If 31 GB is used and you have 32 GB, you will not get an improvement from installing 64 GB; the working set already fits in physical memory. If the memory footprint exceeds what fits in available memory, the OS will start swapping and baking will slow down, so make sure you have enough RAM (a quick pre-bake check is sketched after this list). Don’t disable the OS swap file; the OS or Unity can become unstable when you run out of memory.

4) Yes. Getting more CPU power will scale well as long as there are enough baking jobs to execute. In practice, all the CPU cores will be utilized until the end of a stage, e.g. Light Transport, where we have to wait for the last job to finish.
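There is no way to predict the exact footprint up front, but if you want a belt-and-braces check before starting a long bake, a small editor script can at least warn you when the machine has less physical RAM than your bakes have needed historically (just a sketch; the threshold is your own measurement, not an official figure):

```csharp
using UnityEngine;
using UnityEditor;

public static class PreBakeCheck
{
    // What this scene's bakes have needed historically, in MB.
    // Not an official figure - measure your own scenes.
    const int kObservedPeakMB = 31 * 1024;

    [MenuItem("Tools/Bake With RAM Warning")]
    static void Bake()
    {
        // SystemInfo.systemMemorySize reports total physical RAM in MB.
        // Note this can't see how much the external Unity Job processes
        // will actually end up allocating.
        if (SystemInfo.systemMemorySize < kObservedPeakMB)
            Debug.LogWarning("This machine may swap during the bake.");

        Lightmapping.BakeAsync();
    }
}
```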

Btw., if you have a large plane with objects placed on top, you can get lower memory usage by subdividing the plane into smaller segments. This is because each group (system) of objects is smaller and will have fewer dependencies on the other systems. A rough editor-script sketch of generating such segments follows.
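As an illustration of the idea (a hypothetical editor script; splitting the mesh in your modelling tool works just as well):

```csharp
using UnityEngine;
using UnityEditor;

public static class PlaneSplitter
{
    // Replace one big ground plane with an N x N grid of smaller segments,
    // so each segment ends up in a smaller system with fewer dependencies.
    [MenuItem("Tools/Create Segmented Ground Plane")]
    static void CreateSegmentedPlane()
    {
        const int segments = 4;        // 4 x 4 grid - tune per scene
        const float totalSize = 100f;  // world size of the original plane
        float segSize = totalSize / segments;

        var root = new GameObject("SegmentedGround");
        for (int x = 0; x < segments; x++)
        {
            for (int z = 0; z < segments; z++)
            {
                var seg = GameObject.CreatePrimitive(PrimitiveType.Plane);
                seg.name = string.Format("Ground_{0}_{1}", x, z);
                seg.transform.parent = root.transform;
                // Unity's built-in plane is 10 x 10 units at scale 1.
                seg.transform.localScale = Vector3.one * (segSize / 10f);
                seg.transform.position = new Vector3(
                    (x + 0.5f) * segSize - totalSize / 2f, 0f,
                    (z + 0.5f) * segSize - totalSize / 2f);
                // Mark as lightmap static so it participates in the bake.
                GameObjectUtility.SetStaticEditorFlags(
                    seg, StaticEditorFlags.LightmapStatic);
            }
        }
    }
}
```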

Note that long thin triangles will be harder to tessellate when doing the square clusters in the clustering stage.

In some cases (a large scene with precomputed realtime resolution at 2) with 8 GB of memory, I had Unity crashing with an “out of memory” crash report. Is that expected?

If you disabled your swap file, then the operating system can’t provide Unity with more memory than what is free in RAM, and then it can happen. Running with the swap file disabled is not supported. Otherwise, it shouldn’t happen on a 64-bit system.

Hi guys!

Time to report back on these questions :wink:

@KEngelstoft : Thank you for the clarifications! We ended up buying the two 6-core Xeons plus 64 GB of DDR4 RAM. An expensive setup, but we got a nice boost; bake times are now around 1/3 to 1/4 of the original.

One thing I forgot to mention: the scenes we are working on are a bit heavy on the geo; we must have well over 30M tris.

As for resource usage with the new setup:

- It will peg the 12 cores at 100% for a while on some stages (noticed at Light Transport and Final Gather); usual load is around 20%-60%, with some 100% peaks, but those are momentary.

- RAM usage has gone up to 36.6 GB, but no higher.

- HDD access is usually around 12% max on some stages, a bit more when it writes the maps to disk, of course.

And now I have a couple more questions:

- As I understand your explanation, and from what I’ve read, lightmap baking is basically a render operation. If that is correct, and considering our current setup, we should have our page file on a fast disk (such as an SSD or 10k RPM drive). What sort of disk R/W operations should we expect besides page-file and map writing, and what impact would they have on our baking times?

- What would be the recommended size (poly-wise) per geo for a good quality/bake-time balance? I’m asking because we work in ArchViz, and clients sometimes send us “render models” with over 500k polys that are a pain to import, especially if we have to use the “Generate Lightmap UVs” setting. We are working on an optimization pipeline for this, since that type of geometry becomes unmanageable very fast. It would be great to get some pointers on a “safe” threshold for model complexity; that would make it easier to balance detail and performance.

- We have a 6 GB GDDR5 card on this machine, and just by opening this complex scene we get to 80% card memory usage. Is this basically due to the shaded viewports, or does Unity “reserve” this memory for play mode as well?

- I’ve noticed that if one “breaks” all the prefab connections in a scene, baking times are reduced by a whole lot. I don’t want to be misleading, but we tested this specific situation: before breaking the prefabs, the times were about 60 to 70 seconds; after breaking them, 9 to 10 seconds. We have a number of theories on the team about this behaviour, and it could be vital to know the exact reason. Can you clarify this for me? Why does breaking a prefab connection make baking faster?

- You’ve mentioned subdividing surfaces to create more systems with fewer dependencies between them; this operation could be very interesting for solving other issues we’re having. Could you clarify when this should be considered, e.g. at what size does a plane become “too big” (in Unity units)?

- About the long-triangles tip you gave me: can I assume that the more long triangles I have in a mesh, the harder tessellation will be, and thus the more expensive the bake will get (and the more unpredictable the end result)? Is this correct?

- Finally, about tessellation: this is the stage in which Unity “wraps up” geometries in an imaginary, gapless grid that will become the lightmaps, correct? Could complex geometries lead to tessellation errors that in turn produce the famed “blotching” issues (by overlapping or misplacing objects in the map, for instance)? Or is this a completely different issue?

Again, thank you for the time and for the help!

Just to clarify: I’ve never disabled my swap file…

Putting the page file on a fast disk should not be necessary as long as you have available space in RAM, it is only important if swapping occurs (if you run out of physical memory).

The jobs running in the external Unity Job Process are using disk I/O when starting and finishing, but this should not be the limiting factor for baking times.

The memory usage you see on the graphics card is not impacting or related to lighting bakes. Baking only uses the GPU for a short amount of time when rendering albedo and emissive into low-resolution lightmaps, one map at a time. Afterwards, the lightmaps are freed, so I think what you are seeing is memory taken by the scene viewports, render targets, meshes, etc.

Polygon count is very platform dependent when it comes to rendering; on Xbox 360 hardware, the sweet spot is around 4x4 pixels per triangle. On newer hardware, 2x2 pixels per triangle is OK. If you are doing super-high-end offline rendering, you are moving into several-triangles-per-pixel territory, with MSAA on top. For baking this is not really transferable, as the calculations happen in lightmap space. Clustering will be faster when you have fewer triangles, but the rest of the baking pipeline should not be affected much.
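If you want a quick way to see which meshes in a scene blow past whatever triangle budget you settle on, a throwaway editor script like this can help (a sketch; the 500k threshold just echoes the example above, it is not a recommendation):

```csharp
using UnityEngine;
using UnityEditor;

public static class MeshAudit
{
    // Threshold taken from the 500k example above - not an official budget.
    const int kTriangleBudget = 500000;

    [MenuItem("Tools/Audit Mesh Triangle Counts")]
    static void Audit()
    {
        foreach (var mf in Object.FindObjectsOfType<MeshFilter>())
        {
            var mesh = mf.sharedMesh;
            if (mesh == null) continue;
            int tris = mesh.triangles.Length / 3;
            if (tris > kTriangleBudget)
                Debug.LogWarning(mf.name + " has " + tris + " triangles", mf);
        }
    }
}
```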

I can’t give any specific tips in Unity units, as every scene is different, but imagine a large terrain with a race track on top. Since these two meshes are really big and affect each other, this could take a long time to compute. If you instead cut it up into some smaller chunks, each chunk will have fewer triangles in its dependencies nearby.
Another example could be a house on a big plane. Cutting away a section of the plane 20 meters larger than the house and replacing it with a small plane to take its place will let you decouple the systems and use a different lightmap scale on the faraway plane than on the part close to the house, where the shadows need more resolution (a sketch of setting that per renderer from script follows).
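Setting a different Scale In Lightmap per renderer can be done in the inspector, or from script through the renderer's serialized data. A sketch, assuming the m_ScaleInLightmap property name (which is how the editor stores that inspector value):

```csharp
using UnityEngine;
using UnityEditor;

public static class LightmapScaleUtil
{
    // Set the renderer's "Scale In Lightmap" (the value shown in the
    // inspector) via its serialized property.
    public static void SetScaleInLightmap(Renderer r, float scale)
    {
        var so = new SerializedObject(r);
        so.FindProperty("m_ScaleInLightmap").floatValue = scale;
        so.ApplyModifiedProperties();
    }
}
```

E.g. call SetScaleInLightmap(farPlaneRenderer, 0.1f) on the faraway plane and leave the part near the house at 1.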

About performance when breaking prefab connections: I’ve never heard of this before, and it does sound very interesting. Can you provide a scene with an object where this happens?

- About the long-triangles tip you gave me: can I assume that the more long triangles I have in a mesh, the harder tessellation will be, and thus the more expensive the bake will get (and the more unpredictable the end result)? Is this correct?
Yes, except that the result should be predictable even with different mesh topology.

Blotches should not be caused by tessellation, it sounds like a different issue. Can you show a screenshot?

@KEngelstoft : Hi there, thank you for all the help!

Your tips are very helpful and we will be testing these ideas over the next days.

About blotching, I will send you a couple of screenshots in my next post. One thing about these issues is that they are not consistent, meaning they can vary from bake to bake (or not) and also with resolution changes.

About the scene: sure, we’ve set up a test scene just for validating this “discovery”, so I can send it to you. How do I do that? Do I just add a link to it here?

Again, thanks for the help :wink:

I have sent you a PM about how to upload the project.