(I also posted this in another thread, but as requested made a thread for this too.)
As the title states, the TMP_CharacterInfo.index property doesn’t have the right value when the text contains surrogate pairs (such as emojis).
Bug report:
https://fogbugz.unity3d.com/default.asp?1037828_orjvblcsrigrauhf
Using Unity 2018.1.0 with TextMeshPro 1.2.2 package
Thank you for the post and submitting the bug report.
To provide additional insight on the issue, consider a string such as string s = "A\U0001F60AB"; which C# stores as the UTF-16 sequence "A\ud83d\ude0aB".
The first text element is A at characterInfo[0].index = 0; which is correct.
The emoji is at characterInfo[1].index = 1; which points to the high surrogate of the pair and is correct.
However, for the letter B, characterInfo[2].index = 2; is incorrect, as it points to the low surrogate of the pair. The characterInfo[2].index should equal 3.
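For anyone who wants to check the expected values on their side, here is a minimal sketch in plain C# that computes the string index each text element should report. It does not use TMP or reflect how TMP is implemented; the class and method names (SurrogateIndexSketch, ExpectedIndices) are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of TextMeshPro.
public static class SurrogateIndexSketch
{
    // Returns, for each text element (code point) in s, the index of its
    // first UTF-16 char in the string.
    public static List<int> ExpectedIndices(string s)
    {
        var indices = new List<int>();
        for (int i = 0; i < s.Length; i++)
        {
            indices.Add(i);

            // A valid surrogate pair occupies two chars, so skip the
            // low surrogate before moving on to the next element.
            if (char.IsSurrogatePair(s, i))
                i++;
        }
        return indices;
    }
}
```

For the string "A\U0001F60AB" this gives 0, 1, 3, which matches the values described above.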
I’ll provide feedback as soon as I have an update.
I have finally had the time to circle back on this issue. It will be fixed in the next release of the TMP package for Unity 2018.3, which will be version 1.4.0.
At the same time, I also made changes to the TMP Input Field to add proper handling of Surrogate Pairs.
Awesome, will check it out when it gets released.