StringBuilder generates garbage in .NET 4.x but not in .NET 3.5

I recently switched to .NET 4.x, as .NET 3.5 is now deprecated.

I was unpleasantly surprised to see that some StringBuilder operations now generate garbage where they didn’t previously.
The most trivial case is clearing a StringBuilder. It is done as before, by setting ‘Length’ to 0, but it may now generate garbage: when there are several chunks, the old ones are dropped and a single large new one is created (in order to keep the same capacity).
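
To see the capacity behavior described above outside the profiler, a small probe along these lines may help. It is only a sketch: ‘ClearCapacityProbe’ and ‘Measure’ are hypothetical names, and the values returned depend on the runtime's StringBuilder implementation, so no particular output is claimed here.

```csharp
using System.Text;

// Hypothetical helper for illustration; not part of Unity or .NET.
public static class ClearCapacityProbe
{
    // Returns { capacity before clearing, capacity after clearing }.
    public static int[] Measure()
    {
        // Default capacity, so the appends below force the builder to grow.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 10; i++)
        {
            sb.Append("0123456789"); // 100 characters in total
        }

        int before = sb.Capacity;
        sb.Length = 0;           // same effect as Clear() on .NET 4.x
        int after = sb.Capacity;

        return new[] { before, after };
    }
}
```

If the capacity is preserved while the builder had grown in several steps, the buffer holding that capacity had to be allocated during the clear. The Unity profiler remains the reliable way to measure the actual GC allocation.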

It also seems that ‘Append’ generates more garbage than before, but that wouldn’t be such a problem if ‘Clear()’ didn’t generate any.

Even just creating a StringBuilder is enough to generate garbage:


For information, the StringBuilders created here are empty. Here is the code creating them:

for (Int32 i=PoolLength; i<NewLength; i++)
{
    NewData[i] = new T();
}

Where ‘T’ is ‘StringBuilder’ of course.

I think that having StringBuilder generate more garbage than in the previous framework is a step backward.

Can you submit a bug report on this issue?

I’m not sure it’s a bug.
I’ve looked into Microsoft’s code for the ‘StringBuilder.cs’ file, and it does generate garbage.
It comes from how the chunks are managed internally.

If Unity is willing to replace the ‘StringBuilder’ implementation with one which is garbage-free then I’ll be delighted.
My post’s goal was mainly to find out whether anyone has a GC-free alternative to StringBuilder, in order to avoid reinventing the wheel.

But if you confirm that it should be considered a bug, then I’ll post a bug report with a repro project showing different cases where garbage is generated when it could be avoided.

We’ll need to investigate it a bit more, but we would like to avoid generation of garbage here if at all possible. The best way to handle this is via a bug report, even if this is not strictly a bug.

OK, I’ve sent the bug report; the number is 1123117.

If someone is interested, here is the test script (just place it on the camera of an empty scene, start the project, wait 5 frames and then take a look at the profiler).

using System;
using System.Text;

using UnityEngine;
using UnityEngine.Profiling;


public class Test : MonoBehaviour
{
    static StringBuilder EmptyTest;
    static StringBuilder First;
    static StringBuilder Second;

    void Start()
    {
        // .NET 3.5: 48 bytes
        // .NET 4.x: 112 bytes
        // It shouldn't generate any garbage, only allocate some memory
        Profiler.BeginSample("StringBuilder empty constructor");
        EmptyTest = new StringBuilder();
        Profiler.EndSample();

        // .NET 3.5 total GC:  48 bytes
        // .NET 4.x total GC: 112 bytes




        // .NET 3.5: 48 bytes
        // .NET 4.x: 112 bytes
        // It shouldn't generate any garbage, only allocate some memory
        Profiler.BeginSample("First StringBuilder constructor");
        First = new StringBuilder("Small");
        Profiler.EndSample();

        // .NET 3.5: 90 bytes
        // .NET 4.x: 0 bytes
        Profiler.BeginSample("First StringBuilder small init");
        First.Append(" Test");
        Profiler.EndSample();

        // .NET 3.5: 0 bytes
        // .NET 4.x: 112 bytes
        Profiler.BeginSample("First StringBuilder small concatenation");
        First.Append(" - Adding");
        Profiler.EndSample();

        // .NET 3.5: 0 bytes
        // .NET 4.x: 96 bytes   // Shouldn't be generating garbage
        Profiler.BeginSample("First StringBuilder clear");
        First.Clear();
        Profiler.EndSample();

        // .NET 3.5: 0 bytes
        // .NET 4.x: 0 bytes
        Profiler.BeginSample("First StringBuilder 2nd small concatenation");
        First.Append(" - Adding");
        Profiler.EndSample();

        // .NET 3.5: 0 bytes
        // .NET 4.x: 0 bytes
        Profiler.BeginSample("First StringBuilder 2nd clear");
        First.Clear();
        Profiler.EndSample();

        // .NET 3.5 total GC: 138 bytes
        // .NET 4.x total GC: 320 bytes




        // .NET 3.5: 48 bytes
        // .NET 4.x: 150 bytes
        // Note that the initial size of the text put in the StringBuilder constructor changes something, which is not normal at all, there should be only allocation, no memory release at that point
        Profiler.BeginSample("Second StringBuilder constructor");
        Second = new StringBuilder("Large initial Stringbuilder content");
        Profiler.EndSample();

        // .NET 3.5: 166 bytes
        // .NET 4.x: 150 bytes
        Profiler.BeginSample("Second StringBuilder small init");
        Second.Append("Test");
        Profiler.EndSample();

        // .NET 3.5: 306 bytes
        // .NET 4.x: 220 bytes
        Profiler.BeginSample("Second StringBuilder large concatenation");
        Second.Append(" - Adding a lot of character to a StringBuilder shouldn't be a problem for the garbage collector");
        Profiler.EndSample();

        // .NET 3.5: 0 bytes
        // .NET 4.x: 312 bytes      // Not normal, it's a 'Clear()'
        Profiler.BeginSample("Second StringBuilder clear");
        Second.Clear();
        Profiler.EndSample();

        // .NET 3.5: 0 bytes
        // .NET 4.x: 0 bytes
        Profiler.BeginSample("Second StringBuilder 2nd large concatenation");
        Second.Append(" - Adding a lot of character to a StringBuilder shouldn't be a problem for the garbage collector");
        Profiler.EndSample();

        // .NET 3.5: 0 bytes
        // .NET 4.x: 0 bytes
        Profiler.BeginSample("Second StringBuilder 2nd clear");
        Second.Clear();
        Profiler.EndSample();

        // .NET 3.5: 0 bytes
        // .NET 4.x: 360 bytes
        Profiler.BeginSample("Second StringBuilder very large concatenation");
        Second.Append(" - Adding a lot of character to a StringBuilder shouldn't be a problem for the garbage collector - Adding a lot of character to a StringBuilder shouldn't be a problem for the garbage collector");
        Profiler.EndSample();

        // .NET 3.5: 410 bytes      // That's very strange, no garbage with the previous concatenation, so why does this 'Clear()' generate garbage?
        // .NET 4.x: 0.6 KB
        // It's a 'Clear()', we shouldn't be generating any garbage
        Profiler.BeginSample("Second StringBuilder 3rd clear");
        Second.Clear();
        Profiler.EndSample();

        // .NET 3.5 total GC:  930 bytes
        // .NET 4.x total GC: 1806 bytes
    }

    Int32 FrameCounter = 0;
    public void Update()
    {
        if (FrameCounter > 5)
        {
#if UNITY_EDITOR
            UnityEditor.EditorApplication.isPlaying = false;
#else
            Application.Quit();
#endif
        }

        FrameCounter++;
    }
}

#if NET_LEGACY
public static class StringBuilderExt
{
    public static void Clear(this StringBuilder this_)
    {
        this_.Length = 0;
    }
}
#endif

As anticipated, Unity has classified this as being ‘By design’.

I think that it’s a shame.
When using StringBuilder, one expects to avoid GC allocations. Unfortunately it only reduces them compared to ‘String’, and it increases them compared to the ‘StringBuilder’ from .NET 3.5.

Is there a way to use our own custom implementation of StringBuilder rather than the one in Unity?

You can easily copy in and modify, or write your own version of StringBuilder, and use that in your project.

I’m afraid that won’t work.
There is no point in having a custom ‘MyOwnStringBuilder’ class if I cannot use these ‘MyOwnStringBuilder’ objects in Unity.
For example, TextMeshPro can take a StringBuilder as an input in the ‘SetText()’ method, but it will not be able to take a ‘MyOwnStringBuilder’ text as an input, so I will have to create a normal ‘StringBuilder’, and then provide it to TMP.
But by doing that, I will generate garbage, because the standard ‘StringBuilder’ generates garbage.

In addition, ‘StringBuilder’ is sealed (and has no virtual methods anyway), so I cannot inherit it and modify the methods I need.

For now, the only way I have found to use a custom StringBuilder is to patch the original methods at the assembly level, inserting a ‘JMP’ instruction at the very beginning of each original method so that it jumps to my own custom method.
While it will certainly work, I would prefer to avoid using such extreme solutions…

Unfortunately, unless we have a way to recompile some system DLLs, I don’t really see how to have a custom StringBuilder which can be used directly in Unity.

True, you cannot pass your own StringBuilder implementation to APIs expecting a System.Text.StringBuilder. For the general use case of building strings you can use your own implementation if you wish.

However, the implementation of StringBuilder in the new scripting runtime comes from the .NET reference source. Its allocation behavior differs from the old one written within Mono: the heuristic for growing the character buffers is different, and it consolidates multiple buffers on Clear. You can avoid this issue by allocating a larger initial buffer for the StringBuilder. For example, if I use an initial size of 64 for First and 256 for Second in the test case, I get less allocation than on the old .NET.
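
As a rough illustration of that suggestion, the sketch below creates the two builders from the test script with the capacities mentioned above (64 and 256) and checks that the capacity does not change across the appends and the clear. The class and method names are hypothetical, and an unchanged ‘Capacity’ is only a hint that no extra buffer was needed; it does not replace a profiler measurement.

```csharp
using System.Text;

// Hypothetical helper for illustration; not part of Unity or .NET.
public static class PreSizedBuilders
{
    // Capacities taken from the suggestion above for this specific test case.
    public static StringBuilder CreateFirst()
    {
        return new StringBuilder("Small", 64);
    }

    public static StringBuilder CreateSecond()
    {
        return new StringBuilder("Large initial Stringbuilder content", 256);
    }

    // Appends the given text, clears the builder, and reports whether
    // the capacity stayed the same the whole time.
    public static bool CapacityUnchanged(StringBuilder sb, string text)
    {
        int initial = sb.Capacity;

        sb.Append(text);
        bool sameAfterAppend = sb.Capacity == initial;

        sb.Clear(); // on .NET 3.5 use sb.Length = 0 instead
        bool sameAfterClear = sb.Capacity == initial;

        return sameAfterAppend && sameAfterClear;
    }
}
```

For example, ‘PreSizedBuilders.CapacityUnchanged(PreSizedBuilders.CreateFirst(), " Test - Adding")’ stays within the 64-character buffer, so the builder never has to grow.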

That’s true, the garbage on ‘Clear()’ is only generated once, at least as long as we do not exceed the capacity again.

But it’s still a shame that creating a StringBuilder (or clearing it in some situations) generates garbage.
And I haven’t even looked at the insertion/removal of characters in the middle of the StringBuilder.

I guess one way to solve the problem would be to reuse the StringBuilders through a pool: after some time, all the pooled StringBuilders would have a large capacity and wouldn’t generate garbage anymore.
But putting them back on the pool would add a lot of complexity.
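
A minimal sketch of such a pool could look like the following. Everything in it is an assumption made for illustration: the names ‘StringBuilderPool’, ‘Rent’ and ‘Return’ are hypothetical, the default capacity of 256 is arbitrary, and the code is not thread-safe.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical pool for illustration; not part of Unity or .NET.
public static class StringBuilderPool
{
    // Arbitrary starting capacity; tune it to the strings actually built.
    private const int DefaultCapacity = 256;

    private static readonly Stack<StringBuilder> Pool = new Stack<StringBuilder>();

    // Hands out a pooled builder, or allocates a new one if the pool is empty.
    public static StringBuilder Rent()
    {
        return Pool.Count > 0 ? Pool.Pop() : new StringBuilder(DefaultCapacity);
    }

    // Empties the builder and puts it back for later reuse.
    // The caller must not keep using the builder after returning it.
    public static void Return(StringBuilder sb)
    {
        sb.Length = 0;
        Pool.Push(sb);
    }
}
```

The added complexity is the one mentioned above: every ‘Rent()’ needs a matching ‘Return()’, and the first clear of a builder that has outgrown its initial capacity may still allocate once.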