TMP: Breaking Chinese Bugs for Traditional ZhuYin ("BoPoMoFo" - Taiwan Locale)

When using TextMeshPro InputFields for Traditional Chinese in Taiwan (the input method is called “ZhuYin” or “BoPoMoFo”, where you combine phonetic symbols with numbers [indicating tone] to compose characters), there are some prominent (but intermittent) bugs:

Screenshot:

Screencast:

Referencing the screenshot above:

  • Click in TMPInputField

  • 1st char will be in Chinese (normal)

  • 2nd char will become English (expected: Chinese) – indicated in the example screenshot; notice the number 5, followed by 2 Chinese characters.

  • If you unfocus the TMPInputField (click away) and return, it’ll work until you submit once – then steps #2 and #3 (above) will be repeated. There are no special OnValueChange() events associated with this TMPInputField. Submitting only clears the field.

  • If you continue to focus the TMPInputField and type characters, the 1st char will always be an English character until you unfocus >> refocus (then the bug returns after the 1st line).

  • Anytime you see an English char or number @ char[0], it’s bugging out.

  • Both the TMPInputField and the submitted txt contain the buggy str (which is why it also reflects in the chat, which is also a read-only TMPInputField)
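The rule of thumb in the list above (an English letter or digit at char[0] signals the bug) can be turned into a quick check for flagging affected messages in logs or tests. This is only a sketch: the function name is hypothetical, and it assumes legitimate messages never begin with an ASCII letter or digit, which will not hold for mixed-language chat.

```python
def starts_with_leaked_ascii(message: str) -> bool:
    """Return True if the first character is an ASCII letter or digit.

    Heuristic only: with a Chinese IME active, a leading "5" or "j"
    suggests that a keystroke bypassed composition.
    """
    if not message:
        return False
    first = message[0]
    return first.isascii() and first.isalnum()


# Sample strings of the kind reported in this thread.
for sample in ("5翁", "j恩", "中文"):
    print(sample, starts_with_leaked_ascii(sample))
```

A check like this only detects the symptom after the fact; it does not repair the text.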

EDIT: Similar post reported in 2018:

Thanks for the feedback / report.

There are a few issues related to IME that have been present for a while now. These are not specific to TMP or even the legacy text system but to IME handling which is handled by the Input team.

I will be talking to the Input team shortly and will be sure to bring up these issues as well.

I’ll post back once I have additional information.


Ah, thanks for looking into this!

Hi @Stephan_B , any chance you have a timeline estimate on this? I’m approaching an awkward point where I must soon decide if I should try to find a ghetto workaround or if I should wait for a fix. We’re currently in alpha for Chinese testing. Let me know if you require additional details.

Apologies for the pressure! Just trying to plan :)

PS - have any temporary workaround suggestions that would be easy to revert later upon a fix?

I will take a look over the weekend and provide feedback no later than Sunday night.


Can you check if you are getting the same behavior in any Editor text field / inspector? Also please see if you get the same behavior in the UI Input Field.

What character sequence / keyboard keys are you typing to reproduce these behavior?

Whenever the input field is un-focused, IME is disabled. However, whenever you re-focus the input field, IME is not re-enabled. Right now I have to use CTRL + Spacebar to re-enable IME.

There are definite issues with the IME input system.


I’ll be able to test in a couple days and get back to you~

@Stephan_B Ok, I got my wife to help out (my Chinese is limited, especially Bopomofo [Taiwan, Traditional Chinese] IME):

ㄏㄏ>> c space c space
寮>>xul6
監>>ru0 space
視>>g4
中>>5j/ space
文>>jp6
I tried to type "中文" again, but it changed to English automatically, so it showed "5" on the next line
翁>>j/ space
Tried to type "中" again, and it worked
Tried to type "中" again; it failed and showed as "5翁"
(When I typed 中, the composition started with keyboard key "j".)
Tried to type "文" (jp6), but it showed as "j恩"
How to type 恩>> "p space"
If you need more info, or want us to type something in a particular way, let me know. Thanks
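To make the key sequences above easier to read, here is a small sketch that translates the raw keystrokes into the Bopomofo symbols they stand for. The mapping is inferred only from the sequences quoted in this post (for example 中 = ㄓㄨㄥ = "5j/" plus space), so it covers just the keys that appear here and is not a full layout. `simulate_leak` is a hypothetical model of the symptom, not of Unity's actual IME handling: it assumes the first keystroke reaches the field as plain text while the rest are composed normally.

```python
# Key-to-symbol pairs inferred from the sequences quoted above.
# Only keys that appear in this thread are listed.
KEY_TO_ZHUYIN = {
    "c": "ㄏ", "x": "ㄌ", "u": "ㄧ", "l": "ㄠ", "r": "ㄐ",
    "0": "ㄢ", "g": "ㄕ", "5": "ㄓ", "j": "ㄨ", "/": "ㄥ", "p": "ㄣ",
    "6": "ˊ",  # second tone
    "4": "ˋ",  # fourth tone
    " ": "",   # space commits a first-tone syllable (no tone mark)
}


def decode_keys(keys: str) -> str:
    """Translate raw keystrokes into the Bopomofo symbols they stand for."""
    return "".join(KEY_TO_ZHUYIN[k] for k in keys)


def simulate_leak(keys: str) -> str:
    """Hypothetical model of the bug: the first key bypasses the IME."""
    return keys[0] + decode_keys(keys[1:])


print(decode_keys("5j/ "))    # intended composition for 中
print(simulate_leak("5j/ "))  # "5" stays raw; the rest equals the keys for 翁
```

Under that assumption, typing 中 leaves a literal "5" followed by the composition for "j/ space", which is the sequence listed for 翁 and matches the reported "5翁".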

I’m too deep in a messy project to test anything else, yet, but hope this helps.

I’ll re-test these.

Were you able to test this in a normal text field in the Editor in some inspector?

Seems fine in the inspector, so far. By normal, you mean a TMPro UGUI, as opposed to an input field, right?

To add, I’m also noticing some weird stuff relating to copy+pasting of traditional Chinese characters (possibly any unicode). If I paste some characters inside an editor text field, it’ll show as something random. However, if I ctrl+Z, the entire txt is wiped EXCEPT for what I was originally trying to paste!

To update, what may or may not be related, there’s another at least semi-related issue:

Within the editor, I often cannot copy+paste Unicode values to each other (specifically, to a TMPro UGUI) – without doing something hacky. Sometimes it works, sometimes it doesn’t. However, right now (even after a reboot), it won’t:

  • If I copy unicode from outside of Unity, it works fine.
  • If I copy unicode from within Unity to within Unity (eg, from a ScriptableObj to a UGUI text field), this is what happens:

  • ^ What’s happening above is me trying to copy+paste the Chinese version of “Sheriff” ( 警長 ) from a ScriptableObject to a TMPro UGUI Text field: When I paste the value, the unicode is replaced with “fw”. You can see me copy+paste “asdf” no problem (ASCII English).
  • If I copy FROM Unity to OUTSIDE Unity (eg, notepad), the result is also “fw”.
  • If I try to copy a different Unicode string from within Unity to Unity, such as “Marshal” ( 刑事官 ), nothing is pasted at all. In fact, right-clicking won’t even show a “Paste” option:

It seems that issues with Unicode extend to the Editor – but it’s not entirely clear what’s causing it (I don’t have enough time to check patterns further, unfortunately)

EDIT: To rule out my ScriptableObject having issues, I’ll repro this breaking FROM tmpro ugui to tmpro ugui:

  1. copy+paste FROM Notepad TO TMPro UGUI.
  2. Copy from TMPro UGUI to TMPro UGUI. It will not work.

HOWEVER, notice at the end I got it to show – sort of: If I CTRL+Z, the field wipes and is replaced with what was meant to be pasted. However, this means I cannot paste into the middle of existing text and then CTRL+Z, because any accompanying text will be wiped EXCEPT what was supposed to be pasted.

So pasting Unicode from plaintext Notepad to a Unity TMPro UGUI Text field will permanently do something very strange to the text. I can no longer copy+paste it anywhere:

I cannot paste Unicode vals from within a TMProUGUI Text field anywhere within Unity – and even externally. Pasting here or back to notepad results in an empty paste.

Thank you for providing all this great feedback.

I am slowly going through all of these. Some of these issues have been in Unity since Unity 4.x, so tracking down those with knowledge of those implementations, or getting up to speed on my end on the IME-related implementations, is slow.

Please keep on posting in this thread as you uncover additional issues.


When I try to paste from Unity to Unity:

儲存至桌面為

The following is, instead, displayed:

2X

If I paste here, here’s what’s displayed:

2X (Notice the +secret tofu box)

EDIT: Tofu isn’t showing. Here’s what I see:

By 2X, you mean duplication of the composition sequence? If that is the case, I did resolve that issue which has been in Unity since at least 5.x.

P.S. I am focusing on the TMP Input Field issues first and not necessarily Editor / Inspector issues. However, in the case of the duplication issue, I did fix that one in Unity so it is also fixed in the Editor. Editor / Inspector specific issues, I will be passing to the Editor team.

Oops, no, just a literal string. I wonder if it would partially match a /unicode sequence. I’m not good with that, though, so no idea.
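One guess about where the literal strings come from, not confirmed by anyone in this thread: the pasted text looks like what you would get if each UTF-16 code unit were truncated to its low byte somewhere along the clipboard path. The sketch below only does the arithmetic on the sample strings from the earlier posts; the function names are hypothetical and nothing here calls Unity.

```python
def low_bytes(text: str) -> bytes:
    """Keep only the low 8 bits of each UTF-16 code unit in `text`."""
    data = text.encode("utf-16-le")
    # Little-endian UTF-16 stores the low byte of each code unit first.
    return data[0::2]


def as_visible(values: bytes) -> str:
    """Render printable ASCII as-is and everything else as \\xNN."""
    return "".join(
        chr(v) if 0x20 <= v < 0x7F else f"\\x{v:02x}" for v in values
    )


for sample in ("警長", "儲存至桌面為", "刑事官"):
    print(sample, "->", as_visible(low_bytes(sample)))
```

Under this assumption, 警長 (U+8B66, U+9577) reduces to "fw", 儲存至桌面為 begins with "2X" followed by a non-ASCII byte, and 刑事官 begins with a control byte (0x11), which would be consistent with the "fw", "2X" plus tofu, and empty pastes described above. It does not identify which component performs the truncation.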

Just wanted to do a quick follow-up: Any luck?

Hi, I had already filed a bug report regarding this issue one year ago. It was a bug that had existed at least from Unity 5.
Support told me that they were working on the new input system. However, after checking the new input system that is currently in alpha, I don’t think this new feature is relevant to this issue. This issue is about Unity’s mishandling of IME: the built-in input field UI and the editor input field use the same system, and they both exhibit the same bug.
I am developing a Chinese word game, and most of my players are using Microsoft Bopomofo to type Chinese. This bug is extremely crucial for me. I am still waiting for the official fix to this issue.
This was the thread that I posted one year ago:

Edit: I have just seen that Stephan_B said that he had resolved that problem. He was able to reproduce the issue when we discussed on my thread. So glad to hear that this issue would be fixed sometime soon. (although it has been another 4 months since the end of this thread…)


We’re still getting this, too :( It’s become critical now. There are so many weird bugs. Just try switching to a Chinese IME and typing – Taiwan Bopomofo and PinYin both have some serious, serious issues.

Unity is the ONLY game engine used in Taiwan (many don’t even know about Unreal at all) – hopefully this becomes a priority soon.

EDIT: Just saw the edit above; that’s fantastic news - where can I find this fix?

Just wanted to follow up - any progress on this? Or if it’s fixed, is it already on the current preview?

*Ignore attached image - can’t seem to remove it

ok, to follow up - I tried preview 4. Seems like this MAY have resolved the major bugs in the editor, but haven’t built yet or done any testing with our actual Chinese community. Just initially. HOWEVER…

Typing in Unicode will, for some reason, prefix and suffix and , respectively:

5445837--555510--upload_2020-2-5_20-34-2.png

You can repro this by typing the letter “x” with a Chinese IME (any form - PinYin or Bopomofo), and as the suggestions pop up, you’ll see the {pendingTypedMsg}. You can see it more clearly if you press backspace once:

5445837--555513--upload_2020-2-5_20-35-27.png

I’m going through my code to make sure I don’t have something REALLY WEIRD lingering from an old dev, but I really think it’s not me.

EDIT: Upon testing with OnValueChanged for the TMP_Input, I discovered that OnValueChanged doesn’t trigger at all when the appears.

EDIT 2: Hmm, can’t repro this minimally. Seeing what’s different. It’s strange because OnValueChanged doesn’t even trigger when I’m doing this, so where could this possibly happen? It feels like a hidden style is wrapping around it like that.

EDIT 3: I noticed that if you alt-tab while you’re halfway through selecting a character, the placeholder will show, stacking on top (repro’d minimally)

EDIT 4: Waaaitttt a second… are you noticing the underlined portion? I bet I can repro this by disabling rich text!!