IL2CPP interaction with non-ASCII characters

Unity's IL2CPP converts C# code to C++ code, but the two languages handle characters differently. Specifically, C# counts Russian, Chinese, Tamil, Hindi, etc. letters as one char each, whereas a C++ char is always 1 byte, so these non-English characters take up more than one char there. Does IL2CPP account for this?

Will the following code output the same thing when built with Mono and with IL2CPP?

string s = "Привет, мир";
UnityEngine.Debug.Log(s.Length.ToString() + " " + s[7].ToString());

IL2CPP does not use the C++ char type to represent C# strings. It stores them as UTF-16 code units, just as .NET does. So yes, IL2CPP does account for this, and your code will output the same value in both Mono and IL2CPP.
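
To illustrate the difference, here is a minimal C++ sketch. It is not IL2CPP's generated code; std::u16string only stands in for the idea of 16-bit string storage, and the function names (hello_utf8, hello_utf16, utf16_unit_at) are made up for this example. It holds the same text once as UTF-8 bytes in a char-based string and once as UTF-16 code units.

```cpp
#include <cstddef>
#include <string>

// "Привет, мир" as UTF-8 bytes in a char-based string.
// Each Cyrillic letter takes 2 bytes; the comma and the space take 1 byte each.
// (Hypothetical helper, for illustration only.)
std::string hello_utf8() {
    return "\xD0\x9F\xD1\x80\xD0\xB8\xD0\xB2\xD0\xB5\xD1\x82, "
           "\xD0\xBC\xD0\xB8\xD1\x80";
}

// The same text as UTF-16 code units, which is how a .NET string is laid out.
// Every character in this particular string fits in a single 16-bit unit.
// (Hypothetical helper, for illustration only.)
std::u16string hello_utf16() {
    return u"\u041F\u0440\u0438\u0432\u0435\u0442, \u043C\u0438\u0440";
}

// Number of char elements (bytes) in the UTF-8 version.
std::size_t utf8_byte_count() {
    return hello_utf8().size();
}

// Number of 16-bit code units, the quantity that C#'s string.Length reports.
std::size_t utf16_unit_count() {
    return hello_utf16().size();
}

// Code unit at a given index, analogous to the C# indexer s[i].
// Throws std::out_of_range for an invalid index.
char16_t utf16_unit_at(std::size_t index) {
    return hello_utf16().at(index);
}
```

Here utf8_byte_count() returns 20, while utf16_unit_count() returns 11, which matches s.Length in the C# snippet. Note that index 7 is the space after the comma, so s[7] is a space and the first letter of "мир" is at index 8.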

@ScottHFerguson
That was my intuition, but I had to wait for someone who knew this for a fact. Thanks for the reply.