Is using strings still bad in 4.6?

I’ve read quite a few topics saying that using strings was not good practice in older versions of Unity because of some bugs from … So my question is: is it okay to use strings in Unity 4.6 now, or still not? If not, are there any cases where I can use them without getting hurt? And if strings are still not recommended, is there an easy workaround?

EDIT: I use strings for loading scenes, resources, comparing gameObjects, storing strings in Collections as data for later usage in the game.

EDIT: Might be interesting to try out GC free string: http://forum.unity3d.com/threads/gstring-gc-free-string-for-unity.338588/


Something I forgot to mention in my original answer: the intern table/pool.

The common language runtime conserves string storage by maintaining a table, called the intern pool, that contains a single reference to each unique literal string declared or created programmatically in your program. Consequently, an instance of a literal string with a particular value only exists once in the system.

Which means:

string h = "Hello ";

^ doesn’t allocate a new string and assign it to ‘h’. "Hello " is a string literal, so it’s already interned (it exists in the intern pool); we just get the reference to it and assign it to ‘h’.

Similarly this also doesn’t allocate:

string hw = "Hello " + "World!";

This is a concat between two literals. Since both "Hello " and "World!" are literals, the compiler folds them into the single literal "Hello World!" at compile time, so the result is also interned.

Proof:

bool isInterned = string.IsInterned(hw) != null;
Debug.Log(isInterned); // true

Same thing here:

const string h = "Hello ";
const string w = "World!";
string hw = h + w;

‘h’ and ‘w’ are constant string literals, and it would make sense that constant + constant = constant. So the result of h + w is also interned and it won’t allocate.

bool isInterned = string.IsInterned(hw) != null;
Debug.Log(isInterned); // true

Now take a look at this:

string h = "Hello ";
string w = "World!";
string hw = h + w;

We’ve seen that "Hello " + "World!" doesn’t allocate, and that ‘h + w’ doesn’t allocate either when h and w are constants. In this case, though, ‘h’ and ‘w’ are not constants; they’re just two variables referencing the interned strings "Hello " and "World!". Now h + w will allocate: the compiler cannot make any assumptions here, because nothing prevents the programmer from changing their values after the assignment and before the h + w, so the concat happens at runtime and produces a new, non-interned string.
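If you want to check the three cases yourself, here’s a minimal sketch that compares references instead of asking IsInterned. The class and method names (InternDemo etc.) are made up for illustration, and it assumes the runtime interns the literals of your own assembly, which is what Mono and the desktop CLR do in practice.

```csharp
using System;

public static class InternDemo
{
    // Two literals: folded by the compiler, so both variables point at the same pooled object.
    public static bool LiteralConcatIsShared()
    {
        string a = "Hello " + "World!";
        string b = "Hello World!";
        return object.ReferenceEquals(a, b);
    }

    // Non-constant operands: the concat runs at runtime and builds a new object,
    // so this is expected to return false.
    public static bool RuntimeConcatIsShared(string h, string w)
    {
        string hw = h + w;
        return object.ReferenceEquals(hw, "Hello World!");
    }

    // string.Intern maps any string back to the pooled instance that has the same value.
    public static bool InternedConcatIsShared(string h, string w)
    {
        string pooled = string.Intern(h + w);
        return object.ReferenceEquals(pooled, "Hello World!");
    }
}
```

Note that string.Intern doesn’t save you the allocation of h + w itself; it only lets you swap the temporary for the pooled instance afterwards.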

Here’s a fun little instructive hack:

fixed(char* ptr = "Hello")
    *ptr = 'X'; // or ptr[0] which is the same as *(ptr + 0) for those unfamiliar with pointer math :p

string test = "Hello";
Debug.Log(test);

Can you guess the output, and why? :P (tip: make sure you understood the intern table)

NOTE: For unsafe code to compile in Unity, you need to add two files under your Assets folder: “smcs.rsp” and “gmcs.rsp” both with the “-unsafe” argument in them. Restart Unity. You could also try it in a Console Application with ‘allow unsafe code’ in the project properties.


Well, first thing: you have to understand why the topics you’re reading say that “strings are bad”. That statement alone, without any context, is very vague and, well, stupid… Strings are essential to games (scores, health, dialogue, etc.), so you can’t avoid them.

Two things to understand about System.String:

  1. It’s not a struct, it’s an object (a reference type, not a value type). That means when you create new strings, they’re allocated on the heap, not the stack. Just like any other object, once you lose the last reference to a string, the GC has to collect it later.

  2. Strings are immutable. Meaning when you do string operations like .Remove and .Concat, those methods take the input you provide, do whatever operation they’re supposed to, and instantiate and return a new string object with the resulting value of that operation. That’s why you don’t see those methods with a void return.
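To make point 2 concrete, here’s a tiny sketch (ImmutabilityDemo is just a name I picked for the example) showing that Remove hands back a different object and leaves the string you called it on alone:

```csharp
public static class ImmutabilityDemo
{
    // Drops the first six characters of 'original' and reports whether the result
    // is a different object from the input.
    public static bool RemoveReturnsNewString(string original, out string shortened)
    {
        shortened = original.Remove(0, 6); // 'original' itself is never modified
        return !object.ReferenceEquals(original, shortened);
    }
}
```

Calling it with "Hello World!" gives you "World!" in ‘shortened’, while the input still reads "Hello World!".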

With that said, take a look at:

void Update()
{
   string playerScoreString = "Player: " + player.Name + " has: " + player.Score.ToString() + " points";
   scoreText.text = playerScoreString;
}

The + operator compiles into a string.Concat call. So it’s roughly as if we had written:

string playerScoreString = string.Concat(new string[] { "Player: ", player.Name, " has: ", player.Score.ToString(), " points" });

That creates several objects (the string from ToString(), a temporary array and the final string), and once nothing references them any more they’re garbage for the GC to collect. Since this is Update, it happens every frame, which means more load on the GC. Unity’s GC is not very sophisticated, so the more load on it, the more effort it takes to clean up; more effort == more CPU cycles, and thus a hit on performance at the end of it all.

The solution is simple in most cases. Most of the time you don’t need to deal with strings in Update like that. Just modify your design a bit: update the score text only if the score really changes.

void Update()
{
   if (player.Score != currentScore)
   {
      playerScoreString = ...;
      currentScore = player.Score;
   }

   scoreText.text = playerScoreString;
}
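Here’s the same idea as a self-contained sketch you can run outside Unity. ScoreLabel and its Rebuilds counter are hypothetical names, there only to show that the string is rebuilt when the score changes and reused otherwise.

```csharp
public class ScoreLabel
{
    private int cachedScore;
    private string cachedText; // null until the first call

    // Counts how many times the string was actually rebuilt (illustration only).
    public int Rebuilds { get; private set; }

    public string GetText(string playerName, int score)
    {
        // Rebuild only on the first call or when the score differs from the cached one.
        // A changed name alone is ignored in this sketch.
        if (cachedText == null || score != cachedScore)
        {
            cachedText = "Player: " + playerName + " has: " + score.ToString() + " points";
            cachedScore = score;
            Rebuilds++;
        }
        return cachedText;
    }
}
```

In a behaviour you would call GetText from Update and assign the result to your text component; while the score stays the same, you get the same string object back and nothing new is allocated.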

Another technique you could use is to memoize the results. Let’s assume that for each unique player there’s an expensive method that returns a string, and that we must use that string in Update.

public class Player
{
   ...
   public string DoExpensiveOperation() {....}
}

void Update()
{
   string expensive = player.DoExpensiveOperation();
}

Well, there are many things you could do. For one, you could just create a string list on start, perform those operations, and store the results in that list. Or just memoize the results.

Func<Player, string> doExpensive;
void Awake()
{
    doExpensive = new Func<Player, string>(p => p.DoExpensiveOperation()).Memoize();
}

void Update()
{
  string expensive = doExpensive(player);
}

This will basically perform the expensive operation associated with that player, and cache the result. So that the next time you call doExpensive on the same player, you would just get the cached result, no effort. You could do this not just on strings, but any expensive operation.

Memoize is an extension method. There’s a version that takes no argument, and another with one argument. Gets more tricky if the arguments are 2 or more.

	public static Func<TResult> Memoize<TResult>(this Func<TResult> getValue)
	{
		TResult value = default(TResult);
		bool hasValue = false;
		return () =>
		{
			if (!hasValue)
			{
				hasValue = true;
				value = getValue();
			}
			return value;
		};
	}

	public static Func<T, TResult> Memoize<T, TResult>(this Func<T, TResult> func)
	{
		var dic = new Dictionary<T, TResult>();
		return n =>
		{
			TResult result;
			if (!dic.TryGetValue(n, out result))
			{
				result = func(n);
				dic.Add(n, result);
			}
			return result;
		};
	}

A very useful technique that, used right, can yield very good performance with little to no effort. I use it quite heavily in both editor and runtime code.
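For the two-argument case, one possible approach (not the only one, and not necessarily the best) is to nest dictionaries, which avoids needing a tuple type for the key. This is just a sketch: MemoizeExtensions is a name I made up, it isn’t thread-safe, and null arguments will throw since Dictionary doesn’t accept null keys.

```csharp
using System;
using System.Collections.Generic;

public static class MemoizeExtensions
{
    public static Func<T1, T2, TResult> Memoize<T1, T2, TResult>(this Func<T1, T2, TResult> func)
    {
        // The outer dictionary is keyed by the first argument, the inner one by the second.
        var outer = new Dictionary<T1, Dictionary<T2, TResult>>();
        return (a, b) =>
        {
            Dictionary<T2, TResult> inner;
            if (!outer.TryGetValue(a, out inner))
            {
                inner = new Dictionary<T2, TResult>();
                outer.Add(a, inner);
            }

            TResult result;
            if (!inner.TryGetValue(b, out result))
            {
                result = func(a, b); // first time we see this (a, b) pair
                inner.Add(b, result);
            }
            return result;
        };
    }
}
```

Usage is the same as with the one-argument version: wrap the function once (in Awake, say) and call the wrapper from then on. Keep in mind the cache never shrinks, so it only makes sense when the set of distinct arguments is small.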


Going back to “strings are bad”: there’s another interpretation. IMO, dealing with string literals is bad, but a different type of bad, more of a “design thing” kind of bad. Take for example StartCoroutine, which takes the name of the routine to run as a string. If you hardcode that and later refactor your code, you’re gonna have problems: you have to look for every string literal where that method is mentioned and change it as well. Same thing with GameObject.Find, variable names of an Animator, axis names, tags, etc. Programming against string literals is very error-prone and could bite you hard.

There are many ways to avoid string literals. Take tags, for example: you could write a static class that contains all the tags in your game as string constants, and then refer to that class whenever you want a tag:

public static class Tags
{
   public const string Player = "Player";
   // etc...
}

Finding gameObjects? Well, just don’t use GameObject.Find; instead, drag and drop the gameObject you’re interested in onto your behaviour in the editor…

Animator variable names and axis? Well, for me I use drawers and popups. (AnimVar, InputAxis) See VFW for more toys like these.