string.Substring creates garbage?

Howdy folks,

I’m trying to clean up garbage creation in my JSONObject class (available for free on the Asset Store :slight_smile:) and I am finding that some of it is coming from calling string.Substring. As you can probably imagine, I kind of need to do substrings when I parse the input string. I can’t really think of a way to do it without allocating more temp strings, other than some kind of refactor where I consume parts of the string as I parse it, and I’m not even sure that would eliminate garbage creation either.

This is one of the last places I’m getting garbage (also Convert.ToSingle, I think), so I’d like to clean it up.

That’s because a string is immutable (basically means you can’t change a string, you have to re-create it).

You may be able to get away with using StringBuilder, depending on what you’re doing.

There’s no way to stop string.Substring from creating garbage. You could convert the string to a character array (ToCharArray) and parse that. Depending on how many substrings you need, that could be a ton of work for very little benefit, or it could reduce your garbage generation a fair amount.
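To make the ToCharArray idea a bit more concrete, here is a minimal sketch. TokenRange, CharArrayScan and NextQuoted are made-up names for illustration, not part of JSONObject, and escaped quotes are deliberately ignored. The idea is to pay for one array allocation up front and then pass around (start, length) pairs instead of new strings.

```csharp
// Hypothetical helper: a window into the shared buffer instead of a new string.
// It is a struct, so creating one does not allocate on the heap.
public struct TokenRange
{
    public int Start;
    public int Length;
}

public static class CharArrayScan
{
    // Finds the next "..." token at or after 'from' and reports it as a range.
    // Returns false when no complete pair of quotes remains.
    // Assumption: escaped quotes (\") are not handled in this sketch.
    public static bool NextQuoted(char[] buffer, int from, out TokenRange range)
    {
        range = new TokenRange();

        // Locate the opening quote.
        int open = -1;
        for (int i = from; i < buffer.Length; i++)
        {
            if (buffer[i] == '"') { open = i; break; }
        }
        if (open < 0) return false;

        // Locate the closing quote.
        int close = -1;
        for (int i = open + 1; i < buffer.Length; i++)
        {
            if (buffer[i] == '"') { close = i; break; }
        }
        if (close < 0) return false;

        range.Start = open + 1;            // first character inside the quotes
        range.Length = close - open - 1;   // characters between the quotes
        return true;
    }
}
```

You would call ToCharArray once per input string and then walk the buffer with NextQuoted. Whether that pays off depends on how many of those ranges eventually have to become real strings anyway.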

Strings are always immutable in .Net/Mono though - there’s no way around that.

Off topic… How can you tell that garbage is being generated? Can you get a report of what objects Garbage Collection is going through?

In Unity Pro the profiler tells you about allocation.

To be clear about angrypenguin’s response, it’s the GC Alloc column in the Profiler. It’s not extremely helpful off the bat unless you use Deep Profiling or call Profiler.BeginSample where you suspect garbage is being created.

So to be more specific about my case:

My parser counts up from 0 to string length and essentially cuts out the pieces around the format specifiers ( {,}[ ]:" ) and builds objects around them. So I end up with things like

propName = str.Substring(tokenTmp + 1, offset - tokenTmp - 1);

Is there no better way to do this to avoid creating garbage?
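One partial workaround, sketched below under a big assumption: it only helps when you are checking the range against a name you already know (say, looking for one specific property), not when you need to hand the key back to the caller as a string. string.CompareOrdinal has an overload that compares a range of one string against a range of another, so no temporary string is built. InPlaceCompare and RangeEquals are made-up names.

```csharp
public static class InPlaceCompare
{
    // True when the 'length' characters of 'str' starting at 'start' equal 'key',
    // without building a temporary substring.
    public static bool RangeEquals(string str, int start, int length, string key)
    {
        return length == key.Length
            && string.CompareOrdinal(str, start, key, 0, length) == 0;
    }
}
```

With the variables from the line above, the call would look something like InPlaceCompare.RangeEquals(str, tokenTmp + 1, offset - tokenTmp - 1, "name"). A general-purpose parser still has to produce the key string at some point, so this trims garbage only on the lookup path.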

Not without grossly complicating your code, and probably ending up with worse performance. In any modern language, you’re going to have a ton of little objects like strings that get created and destroyed a lot. But the language implementors know this and generally optimize those pretty heavily already; in most real apps, it ends up being MORE efficient than typical (nontrivial) C code that does similar things, because in C you often end up copying strings a lot just to be sure that nobody mucks with your data.

So my suggestion is to quit worrying about this, and go make a cool game that’s fun to play!

If you’re not doing it in Update, it’s really not much of a problem…

You cannot completely avoid the GC.

You may want to look into using regular expressions (RegEx) to do your matching… it’s not going to completely eliminate allocations but you may be able to drastically reduce them and get away from using Substring.

Well it’s not called every frame if that’s what you mean. I have a sync coroutine that assembles a web request, does it, waits 5 (or whatever) sec, and does it again. So it is still happening in-game, and thus there are GC hitches every few seconds. It’s not terrible, but I’ve gotten rid of most other garbage, so this last one is bugging me! To that note:

Don’t worry, the game is fun! This particular component is also on the Asset Store (for free), so I figured that I owed it to its users to try and make it as good as can be.

Out of curiosity, what kind of regex do you think would work here? Can you search based on character index? Or would you suggest detecting the "s and : and doing a regex that way? The way I’m parsing this, the top-level object uses the whole string at once (since it can still have properties at the end of the string), so I’m worried that doing a regex here would in fact chew up more CPU cycles.
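On the character-index question: the Regex instance method Match(string, int) starts searching at a given index, so in principle you can keep one compiled pattern around and step through the input. A rough sketch follows. The pattern, RegexFromIndex and NextKey are made up for illustration, and escaped quotes are not handled. Note that Match and Group objects are themselves heap allocations, so this may well produce as much garbage as Substring did. It would need measuring in the Profiler.

```csharp
using System.Text.RegularExpressions;

public static class RegexFromIndex
{
    // Hypothetical pattern: a double-quoted key followed by a colon.
    // Assumption: escaped quotes inside the key are not handled.
    static readonly Regex KeyPattern = new Regex("\"([^\"]*)\"\\s*:");

    // Finds the next key at or after 'startAt' and reports where it sits in
    // 'input' rather than returning its text. Reading m.Value or
    // m.Groups[1].Value would allocate a new string.
    public static bool NextKey(string input, int startAt, out int keyStart, out int keyLength)
    {
        Match m = KeyPattern.Match(input, startAt);
        keyStart = m.Success ? m.Groups[1].Index : -1;
        keyLength = m.Success ? m.Groups[1].Length : 0;
        return m.Success;
    }
}
```

The caller would pass keyStart + keyLength (or the position after the value) as the next startAt. The CPU concern is fair too: the regex engine does more work per character than a hand-written loop over the format specifiers.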

What I’m hearing is that this is, in fact, unavoidable. Le sigh.

P.S. Dustin, thanks for helping out your competition! :wink: