Full Emoji Support Api (emoji Sequen

Hello guys, I want to share this API that improves TextMeshPro.

This package includes everything you need to support all Android/iOS emojis in your app (Unicode Emoji 12.0).

The original TextMeshPro maps emojis by single Unicode characters, which is a problem because Android has a huge number of emojis that are encoded as char sequences.

That's why you need this API.

CURRENT VERSION: 1.1.5

NEW INSTALL GUIDE (Package Manager Unity 2019):

  • Delete old versions of this API from your Assets folder
  • Add the following entry to manifest.json (Packages\manifest.json)
{
    "dependencies": {
        "com.kyub.emojisearch": "https://gitlab.com/KyubInteractive/emojisearchapi.git#com.kyub.emojisearch-1.1.5",
        ...
    }
}

FAST GUIDE:

  • Replace the TextMeshProUGUI component on your GameObject with the TMP_EmojiTextUGUI component


  • Enable RichText (we need it because we use <sprite=index> tag to map the emojis).

  • Assign a SpriteAsset in the correct format (generated by the Sprite Importer)
    in TMP Settings or on your TMP_EmojiTextUGUI


PS: EmojiData_google is already included in this project, so you can use it right away.
(License: Apache License 2.0)

This SpriteAsset contains all emojis from Android 9.0 at 32x32 pixels per emoji.

USING CONVERSION TOOL TO GENERATE SPRITEASSETS (NEW GUIDE):

  • Download a JSON file and a sprite sheet texture from iamcal/emoji-data on GitHub (easy to parse data and spritesheets for emoji)
  • In Unity, open the tool 'Sprite Emoji Importer'
    (Path: "Window/TextMeshPro/Sprite Emoji Importer")
  • Drop the JSON downloaded from emoji-data into the tool and configure the size/spacing/padding of the grid in the sprite sheet (default size 32x32, with spacing 2x2 and padding 1x1)
  • Set the Import Format to "Emoji Data Json"
  • Hit Create Sprite Asset
  • Save your SpriteAsset and use it in TextMeshPro (yey!)

ENJOY the Full Unicode Emoji 14.0 Support! YEY!

ABOUT SEARCH ENGINE IMPLEMENTATION:

  • The emoji sequence is extracted from TMP_Sprite.name, which holds the chars in UTF32 or UTF16 hex format separated by '-' (see example below)
    Ex: TMP_Sprite.name = 0023-fe0f-20e3.png

0023-fe0f-20e3.png will generate:
Unicode 00000023 and
Sequence 00000023 0000fe0f 000020e3

so this is a valid CharSequence and must be included in the SearchEngine, because TMP_Text doesn't know how to handle it
(the sequence != unicode representation)

But

0023.png will generate:
Unicode 00000023 and
Sequence 00000023

so this will be ignored by the SearchEngine (default TMP_Text can handle this case without extra overhead, because the sequence == unicode representation)

PS1: This pattern is the default naming in the emoji-data JSONs and in the EmojiOne JSONs
PS2: The CharSequence will be ignored if UnicodeHex == Name.ToHex() or if the Name does not follow the pattern.
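
The naming rule above can be sketched roughly like this. It is a minimal standalone illustration, not the package's actual code: `SpriteNameParser` and `TryGetSequence` are hypothetical names, and it assumes every '-' separated part is one code point written in hex (UTF16 surrogate pairs are not handled).

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.IO;

// Hypothetical helper for illustration only, not part of the package API.
public static class SpriteNameParser
{
    // Returns true only when the name encodes more than one code point,
    // i.e. when the sequence differs from the single unicode value.
    public static bool TryGetSequence(string spriteName, out List<uint> sequence)
    {
        sequence = new List<uint>();
        if (string.IsNullOrEmpty(spriteName)) return false;

        // "0023-fe0f-20e3.png" -> "0023-fe0f-20e3"
        string baseName = Path.GetFileNameWithoutExtension(spriteName);

        foreach (string part in baseName.Split('-'))
        {
            uint codePoint;
            if (!uint.TryParse(part, NumberStyles.AllowHexSpecifier,
                               CultureInfo.InvariantCulture, out codePoint))
            {
                sequence.Clear();
                return false; // the name does not follow the hex pattern
            }
            sequence.Add(codePoint);
        }

        // A single code point is left to default TMP_Text handling.
        return sequence.Count > 1;
    }
}
```

With this sketch, "0023-fe0f-20e3.png" yields the three code points 0x23, 0xfe0f, 0x20e3 and returns true, while "0023.png" returns false.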

  • The search engine caches all emoji sequences of a SpriteAsset in two lookup tables

  • The first table maps the CharSequence to the SpriteIndex
    (example: sequence U+1f3c4 U+200d U+2640 U+fe0f will be mapped to <Sprite=512>)

  • The second dictionary maps every partial path of a sequence up to its end.
    (Example: for the sequence U+1f3c4 U+200d U+2640 U+fe0f we generate an entry for every prefix until the end, because we need O(1) access while checking whether a char is part of a mapped sequence, or whether we can give up and just leave this char in the final text)

FastPath Entries for U+1f3c4 U+200d U+2640 U+fe0f

  • U+1f3c4*

  • U+1f3c4 U+200d*

  • U+1f3c4 U+200d U+2640*

  • U+1f3c4 U+200d U+2640 U+fe0f*
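
As a rough sketch of how the two tables could be filled (not the package's real classes: `EmojiTables` and its members are hypothetical, the keys are assumed to be hex code points joined by '-', and a HashSet stands in for the second dictionary):

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical standalone sketch of the two lookup tables.
public sealed class EmojiTables
{
    // Full sequence -> sprite index, e.g. "1f3c4-200d-2640-fe0f" -> 512.
    public readonly Dictionary<string, int> SequenceToSpriteIndex =
        new Dictionary<string, int>();

    // Every prefix of every registered sequence (the FastPath entries).
    public readonly HashSet<string> FastPath = new HashSet<string>();

    public void Add(IList<uint> sequence, int spriteIndex)
    {
        var key = new StringBuilder();
        for (int i = 0; i < sequence.Count; i++)
        {
            if (i > 0) key.Append('-');
            key.Append(sequence[i].ToString("x"));
            FastPath.Add(key.ToString()); // prefix ending at position i
        }
        SequenceToSpriteIndex[key.ToString()] = spriteIndex;
    }
}
```

Adding the sequence U+1f3c4 U+200d U+2640 U+fe0f with index 512 creates exactly the four FastPath entries listed above and one entry in the sequence table.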

  • During parsing it checks char by char, looking in the FastPathDictionary. If we fail to find an entry in FastPath, we try to retrieve the sprite from the CharSequenceToSpriteIndex table (using the path checked up to the last iteration). If that fails too, we leave this char alone (it is not an emoji sequence).

  • The process is very efficient because all dictionaries are accessed in O(1), so the complexity of parsing a text is O(N), where N = text.length
    (with minor extra overhead during the search).

  • The SearchEngine only parses the text when something has changed in TMP_EmojiTextUGUI

  • The original text in TMP_EmojiTextUGUI is not affected, because the parse process doesn't save the parsed text in m_text (only in the char buffer)
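
The parse step described above could look roughly like the sketch below. It is an illustration under assumptions, not the package's code: `EmojiScanner.Replace` is a hypothetical name, the tables use hex code points joined by '-' as keys, and this variant remembers the longest full match seen while walking the FastPath. The inner loop is bounded by the longest registered sequence, so the scan stays linear in the text length for practical purposes.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical standalone sketch of the char-by-char scan.
public static class EmojiScanner
{
    public static string Replace(IList<uint> codePoints,
                                 IDictionary<string, int> sequenceToSprite,
                                 ISet<string> fastPath)
    {
        var output = new StringBuilder();
        int i = 0;
        while (i < codePoints.Count)
        {
            var path = new StringBuilder();
            int bestLength = 0;
            int bestSprite = -1;

            // Extend the path while it is still a prefix of some sequence.
            for (int j = i; j < codePoints.Count; j++)
            {
                if (j > i) path.Append('-');
                path.Append(codePoints[j].ToString("x"));
                string key = path.ToString();
                if (!fastPath.Contains(key)) break; // give up on this path

                int sprite;
                if (sequenceToSprite.TryGetValue(key, out sprite))
                {
                    bestLength = j - i + 1; // longest full match so far
                    bestSprite = sprite;
                }
            }

            if (bestLength > 0)
            {
                output.Append("<sprite=").Append(bestSprite).Append('>');
                i += bestLength;
            }
            else
            {
                // Not an emoji sequence: leave the char alone.
                // Assumes a valid code point (no lone surrogates).
                output.Append(char.ConvertFromUtf32((int)codePoints[i]));
                i++;
            }
        }
        return output.ToString();
    }
}
```

For the code points A, U+1f3c4, U+200d, U+2640, U+fe0f, B with the tables from the example sequence (sprite 512), this returns "A<sprite=512>B".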

FAQ

If the creator of TextMeshPro needs any help integrating this into the original TMP_Text,
just send me a message or e-mail.
(raf.csoares@gmail.com or raf.csoares@kyubinteractive.com)


Most excellent, I just updated to 2019.1 and have a project of moderate size that relies heavily on TMP (it's a sudoku derivative), and now I need emojis. So I just downloaded it and am going to give it a try. I'll email you if I have any deep technical issues.


Hello! The emojis are small :(
Is it possible to scale them?

Just download the 64x64 version from iamcal/emoji-data on GitHub and follow the tutorial in my post.
But take care: high-end mobile devices only support textures up to 4096x4096 (on low-end devices the limit is 2048x2048).
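
To check whether a sheet fits those limits, a quick calculation like this can help. It is only a sketch: `SheetSize` is a hypothetical helper, and the layout formula (padding on the outer edges, spacing between cells) is an assumption, so verify it against the real sheet you download.

```csharp
// Hypothetical helper; the layout formula is an assumption.
public static class SheetSize
{
    // Side length in pixels of a square sheet with the given column count.
    public static int SidePixels(int columns, int cellSize, int spacing, int padding)
    {
        return 2 * padding + columns * cellSize + (columns - 1) * spacing;
    }

    // Largest column count whose sheet still fits into maxTextureSide.
    public static int MaxColumns(int maxTextureSide, int cellSize, int spacing, int padding)
    {
        return (maxTextureSide - 2 * padding + spacing) / (cellSize + spacing);
    }
}
```

Under these assumptions, 64x64 emojis with spacing 2 and padding 1 fit 62 columns into 4096 pixels (62 x 62 = 3844 sprites) but only 31 columns into 2048 pixels (961 sprites), so the 64x64 sheet is not a safe choice for low-end devices.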

Emojis generated from sheet_apple_32 of emoji-data don't work with the generated JSON; I've noticed that their table uses 0.41em as the padding.

Is there anything I need to change to support the latest emoji-data?

Thanks for your asset! Maybe you can do it for TMP InputField too? :)

Or can I swap the TMP text component in the InputField for this one?

upd: yeah, it works

upd2: no :( the input field shows the emoji, but an error occurs when I try to edit

Hey Rafael_CS just wanted to say thank you very much for this! I was wrangling my head around character sequences support! Thanks again


Can you give me more details about it?

Change the padding while generating the JSON, you can do it in the Editor Window Tool of my asset

I don't remember now, I need to reproduce it again.
But I know that fallback atlas support works incorrectly, cuz the table key gets rewritten for each fallback atlas.

I have 2 atlases, with atlas 2 in the fallback list of atlas 1.
When I input an emoji from atlas 1, it pastes something like this:

from atlas 1 <sprite=10> text

And now if I input an emoji from atlas 2 that has index 10, it pastes the same tag:

from atlas 1 <sprite=10> text, from atlas 2 <sprite=10> test

This shows the emoji from atlas 1 twice, but they should be different. The emoji from atlas 2 must get another index.

I think this part needs some edits:

string tableKey = tableBuilder.ToString ();
if (!string.IsNullOrEmpty (tableKey) && !lookupTableSequences.ContainsKey (tableKey)) {
  lookupTableSequences[tableKey] = j;
}

Cuz in the loop over the atlases, the j variable is incremented from zero again for each atlas

So, I fixed that issue by adding atlas name support.

from atlas 1 <sprite="atlas1" index=10> text, from atlas 2 <sprite="atlas2" index=10> test
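
A rough standalone sketch of that idea (not the actual patch: `AtlasInfo`, `SpriteRef` and `FallbackLookup` are hypothetical names): store the atlas name next to the per-atlas index, so equal indices from different atlases no longer collide.

```csharp
using System.Collections.Generic;

// Hypothetical types for illustration only.
public sealed class AtlasInfo
{
    public string Name;
    // Position in the list = sprite index inside this atlas.
    public List<string> SpriteKeys = new List<string>();
}

public struct SpriteRef
{
    public string AtlasName;
    public int Index;

    public string ToTag()
    {
        return "<sprite=\"" + AtlasName + "\" index=" + Index + ">";
    }
}

public static class FallbackLookup
{
    // Atlases are expected in priority order: main atlas first, then fallbacks.
    public static Dictionary<string, SpriteRef> Build(IEnumerable<AtlasInfo> atlases)
    {
        var table = new Dictionary<string, SpriteRef>();
        foreach (AtlasInfo atlas in atlases)
        {
            for (int j = 0; j < atlas.SpriteKeys.Count; j++)
            {
                string key = atlas.SpriteKeys[j];
                if (string.IsNullOrEmpty(key) || table.ContainsKey(key)) continue;
                // j restarts at zero per atlas, so the name keeps entries apart.
                table[key] = new SpriteRef { AtlasName = atlas.Name, Index = j };
            }
        }
        return table;
    }
}
```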

Nice, I will update the package with support for this modification

[ChangeLog Version 1.1]

Changed parser to support fallbacks.

<sprite name="char sequence">

This way we prevent index problems and avoid the need to put the sprite atlas inside the correct folder.
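
A minimal sketch of the name-based variant, under assumptions: `NameTagTable` is a hypothetical helper, and the lookup key is assumed to be the sprite name without extension in lower case. The tag only carries the sprite name, so no index is stored at all.

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical helper for illustration only.
public static class NameTagTable
{
    // spriteNames: names collected from the main sprite asset and its fallbacks.
    public static Dictionary<string, string> Build(IEnumerable<string> spriteNames)
    {
        var table = new Dictionary<string, string>();
        foreach (string name in spriteNames)
        {
            if (string.IsNullOrEmpty(name)) continue;
            // Assumed key format: "1F1E6-1F1F9.png" -> "1f1e6-1f1f9"
            string key = Path.GetFileNameWithoutExtension(name).ToLowerInvariant();
            if (!table.ContainsKey(key))
            {
                table[key] = "<sprite name=\"" + name + "\">";
            }
        }
        return table;
    }
}
```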

Thanks igor for the bug report


So, can you test this implementation with the TMP Input field? Maybe I'm doing smth wrong...

I will try to take a look.

But on mobile the input field just sends the input text to the tmp_text... so if the input field sends the sequence of characters to tmp_text, the algorithm should decode it and replace it with an emoji

I used your generated JSON file and everything is working fine except for the icons that consist of multiple Unicode characters (like the flags!)

When using your TMP_EmojiTextUGUI I get the same wrong behaviour as when using the normal TextMeshProUGUI component?

The flags are shown as multiple icons as can be seen here (this is now with EmojiTextUGUI): 5076995--499346--Screenshot 2019-10-17 at 11.43.05.png

It should only show the Austrian flags (that is what I entered), so the extra flags shown before them are all wrong.

Shouldn't the TMP_EmojiTextUGUI parse this correctly? I was thinking this is the reason for this control, or am I wrong?

Thanks in advance for your time and efforts,
Oliver

I can see that your search engine correctly changes m_text to the correct sprite name (which is also available in the table!), then calls ParseInputText() and then reverts. But in game it still shows the multiple icons?

I am using Unity 2019.2.8f1

When I remove the line:

//We must revert the original text because we dont want to permanently change the text
m_text = v_oldText;

then it works fine. I guess temporarily changing m_text and calling ParseInputText() does not work in 2019.2.8f1?

I checked it again and found out that the emoji is pasted correctly, but the caret behaves strangely after an emoji is pasted or removed. After some text and an emoji, the caret position is wrong.

Also, when using multi-icons, even though displaying works, I can see several warning messages in Unity from TMP saying that some ASCII value cannot be found in the font asset:

Character with ASCII value of 9792 was not found in the Font Asset Glyph Table. It was replaced by a space.
UnityEngine.Debug:LogWarning(Object, Object)
TMPro.TextMeshProUGUI:SetArraySizes(UnicodeChar[]) (at Library/PackageCache/com.unity.textmeshpro@2.0.1/Scripts/Runtime/TMPro_UGUI_Private.cs:1316)
TMPro.TMP_Text:ParseInputText() (at Library/PackageCache/com.unity.textmeshpro@2.0.1/Scripts/Runtime/TMP_Text.cs:1716)
TMPro.TMP_Text:GetPreferredWidth() (at Library/PackageCache/com.unity.textmeshpro@2.0.1/Scripts/Runtime/TMP_Text.cs:3536)
TMPro.TextMeshProUGUI:CalculateLayoutInputHorizontal() (at Library/PackageCache/com.unity.textmeshpro@2.0.1/Scripts/Runtime/TextMeshProUGUI.cs:93)
UnityEngine.Canvas:ForceUpdateCanvases()