What is the performance delta between old Unity Mono and New?

Q1: What is the performance delta between old Unity Mono and New, do you have any benchmarks you can share?

Hint: UT’s WebGL benchmark could probably be used to highlight the performance improvements.

Really excited to see this.

Q2: Will Unity be fine-tuning SGen for optimal game speed and minimal impact on performance, or will we have access to the SGen options?

We don’t have anything to share now. We’ve not started on significant benchmarking internally, but it is next on the list. We will indeed be using the WebGL benchmark.

For this preview build, we’re still using Boehm, not SGen. We had used SGen in a debug mode for a previous preview build, but other time constraints indicate that we will probably ship the first official Unity release with a new .NET runtime using Boehm, not SGen. There is significant work in the Unity code base necessary to use a moving GC like SGen. That work is currently progressing, but we don’t want it to hold up the new Mono runtime release. So we have decoupled the runtime upgrade from the GC upgrade.

We’re all looking forward to a better GC, but this seems like the best way to get new .NET profile support fastest.


What if SGen didn’t move data, but just managed it? Moving data from one location to another on current-generation CPUs seems like a waste of time and effort, as memory is the slowest component in most systems and, well, it’s called Random Access Memory for a reason.

So instead of moving the actual data, just assign it to a given virtual region, e.g. Nursery, Major Heap, Large Objects.

The only exception to this would be contiguous block allocation for larger objects or collections.

Have you considered smart object pooling built into the Unity API? If you make pooling a formal but optional part of Unity, then the engine can be given information on the dynamic potential of the objects used in a game and can better manage SGen’s options per game and device.
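To make the pooling idea concrete, here is a minimal sketch in Python (chosen only for brevity; Unity scripts would be C#). `ObjectPool`, `Bullet`, `reset_bullet` and `simulate` are hypothetical names, not part of any Unity API. The `created` counter is an assumption about the kind of information a built-in pool could report back to the engine.

```python
class ObjectPool:
    """Toy pool: hands out recycled objects instead of allocating new ones."""

    def __init__(self, factory, reset, capacity):
        self._factory = factory      # creates a new object when the pool is empty
        self._reset = reset          # restores an object to a clean state
        self._capacity = capacity    # maximum number of idle objects to keep
        self._free = []
        self.created = 0             # how many real allocations happened

    def acquire(self):
        # Reuse an idle object if one is available, otherwise allocate.
        if self._free:
            return self._free.pop()
        self.created += 1
        return self._factory()

    def release(self, obj):
        # Clean the object and keep it for reuse if there is room.
        self._reset(obj)
        if len(self._free) < self._capacity:
            self._free.append(obj)


class Bullet:
    """Hypothetical short-lived game object."""

    def __init__(self):
        self.x = 0.0
        self.alive = False


def reset_bullet(bullet):
    bullet.x = 0.0
    bullet.alive = False


def simulate(frames, pool):
    """Spawn and despawn one bullet per frame through the pool.

    Returns the number of objects that were actually allocated.
    """
    for _ in range(frames):
        bullet = pool.acquire()
        bullet.alive = True
        bullet.x += 1.0
        pool.release(bullet)
    return pool.created
```

Calling `simulate(1000, ObjectPool(Bullet, reset_bullet, 8))` goes through a thousand spawn/despawn cycles while allocating a single `Bullet`, so the garbage collector has almost nothing to do.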

Or: SGen is complex, and Unity can deploy to a range of device, memory and CPU scales, but without explicit developer-driven information on a game’s memory profile, how can you maximise/optimise the performance of the GC?

Or chicken and egg: chicken = optimal SGen settings, egg = game running on SGen.


Side question: could Unity harness SGen’s DTrace support in development mode to help optimise the SGen settings?

Think of it as GC Profiling.


I should have been more clear, sorry. In addition to dealing with a moving GC, Unity also needs changes to work with a generational GC. So even to get SGen without moving, we still have work.

We’ve not considered this that I know of. It does sound like possibly a good approach though.

I think that we will end up offering configuration options. But, I’m speaking now about something we’ve not actually tried yet (exposing GC configuration options), so I can’t say with much certainty.

Yes, I think this would be possible.

In general, our approach is two-fold:

  • First, work toward allocation-free APIs. It is better to avoid allocations, no matter how good the GC is.
  • Second, get to a GC where we can have more stable pause times, especially in cases where we generate very little garbage.
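As an illustration of the first point, here is a small sketch of the usual shape of an allocation-free API: the caller owns a buffer and the function refills it, instead of returning a fresh collection on every call. It is written in Python for brevity, and both function names are hypothetical. Python itself still allocates under the hood, so this shows the API shape only, not a real zero-allocation guarantee.

```python
def get_neighbours_alloc(index, width):
    """Allocating style: builds and returns a new list on every call."""
    return [index - 1, index + 1, index - width, index + width]


def get_neighbours_noalloc(index, width, results):
    """Non-allocating style: clears and refills a list owned by the caller.

    Returns the number of results written, so the caller never needs a
    new collection per call.
    """
    results.clear()
    results.append(index - 1)
    results.append(index + 1)
    results.append(index - width)
    results.append(index + width)
    return len(results)


# Typical use: allocate the buffer once, then reuse it every frame.
buffer = []
for cell in (10, 11, 12):
    count = get_neighbours_noalloc(cell, 4, buffer)
```

The second form puts the lifetime of the buffer in the caller’s hands, which is what lets a game keep per-frame garbage close to zero regardless of which GC is underneath.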

This is future work though, not directly related to what will be the first release of the new Mono runtime, I think.


Could a Unity GC be configured to only run during the wait for vsync gaps when a game is running?


Yes, this is something we are considering. We would like to have a GC which you can tell “Run for Xms now, then return”, for some value of X that makes sense for your game. We don’t have that yet, but I think it makes sense for the game development domain.
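A “run for X ms, then return” collector can be pictured as a loop that does small units of work until a deadline passes. The sketch below is a toy model in Python, not how Boehm, SGen or any Unity GC is implemented; `IncrementalCollector` and `run_for` are hypothetical names, and popping an item from a list stands in for marking or sweeping one object.

```python
import time


class IncrementalCollector:
    """Toy collector: processes a queue of work items in time-boxed slices."""

    def __init__(self, work_items, clock=time.perf_counter):
        self._pending = list(work_items)
        self._clock = clock          # injectable so the budget can be tested
        self.processed = 0

    def run_for(self, budget_ms):
        """Do work until the budget is spent or nothing is left.

        Returns True when all work is finished, False if work remains
        for a later slice.
        """
        deadline = self._clock() + budget_ms / 1000.0
        while self._pending and self._clock() < deadline:
            self._pending.pop()      # stand-in for marking or sweeping one object
            self.processed += 1
        return not self._pending


# Typical use: spend at most 2 ms per frame, e.g. while waiting for vsync.
collector = IncrementalCollector(range(100_000))
frames = 0
while not collector.run_for(2):
    frames += 1                      # the game would render a frame here
```

The important design point is that the slice ends on the deadline rather than when the work is done, so the pause per frame is bounded by the budget plus one unit of work.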


Pardon my ignorance, but does this mean IL2CPP and the new Mono runtime are both still running on Boehm? And do fundamental changes need to be made to Unity before a generational GC will work?

Yes, this is correct.

No, the changes in Unity are not fundamental, but they do involve a good bit of the existing Unity code. Basically, it is possible for native code in the engine to modify managed objects without properly informing the GC. When we use a generational GC, this can be a problem, since the GC may not scan the entire heap on each pass.

We don’t think there are any issues with this, but we need to build in the safety to prevent any problems at compile time. That work is currently underway, but is not ready yet.

So to get the new Mono runtime and .NET profile out earlier, we’ve decided to stick with Boehm for the time being. We are planning to use a different GC in the future though.


So we already have an updated Mono runtime in the current experimental build? I thought it was a compiler upgrade only.

@Roni92pl

Yes, the experimental build here: https://forum.unity3d.com/threads/upgraded-mono-net-in-editor-and-some-players-on-5-6-0b5.454387/ uses an updated Mono runtime that supports the .NET 4.6 profile. The upgraded C# compiler has shipped with Unity 5.5, and is fully supported.


How awesome is that! :wink:
I see nobody has reported any performance difference, so here is a quick comparison:
Test case: adding 0.1 to a double 10 million times in a for loop. Players are 64-bit Windows standalones.

5.3.7:
editor: 70ms~
player: 33ms~

5.6.0b5 new mono:
editor: 50ms~
player: 65ms~

The given times are very rough average numbers taken from Unity’s profiler, so don’t take this too seriously.
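For anyone who wants to repeat this kind of measurement, the shape of the test case above can be sketched as follows. This is Python, so the absolute numbers say nothing about Mono or Unity; the original figures came from a C# loop measured in Unity’s profiler. `add_repeatedly` and `time_ms` are hypothetical helper names.

```python
import time


def add_repeatedly(iterations, step=0.1):
    """Add `step` to a running double-precision total `iterations` times."""
    total = 0.0
    for _ in range(iterations):
        total += step
    return total


def time_ms(func, *args, repeats=5):
    """Return the best wall-clock time in milliseconds over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        best = min(best, elapsed_ms)
    return best
```

A run equivalent to the one described would be `time_ms(add_repeatedly, 10_000_000)`. Taking the best of several runs reduces noise from warm-up and background activity, which is one reason single rough averages should be read with caution.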

Out of curiosity I ran a test: a half-minute flight over a terrain generated on the fly from 3 octaves of 3-dimensional simplex noise, triangulated using the surface nets algorithm. The results are pretty accurate and easily reproducible.

Windows Standalone:

5.6 beta 7
x86 - 53.9 ms
x64 - 66.0 ms

5.6 beta 5 mono upgrade preview
x86 - 47.7 ms
x64 - 64.3 ms

These figures are how long it takes on average to create geometry from scratch (lists of vertices, indices, etc.) for a 32x32x32 chunk of the terrain.
