Korean, Japanese and Chinese Localization issues

Hi,

I have downloaded fonts that support the Korean language. I apply the font to my TMP Text and google the translation for, let's say, “hello”.

English: “Hello”
Korean: “여보세요”

However, if I paste the Korean into my TMP Text, the characters are just squares.
If I try to type normally on my keyboard, Korean characters show up, but they stack on top of each other in the text field.

How do I solve this?

EDIT: It works fine using normal Unity Text, so this is something directly related to TMP.

You need to either create a font asset that includes these characters specifically, or create a font asset that is set to Dynamic mode. Dynamic mode is a new feature available in version 1.4.0 of TextMesh Pro for Unity 2018.3, or 2.0.0 for Unity 2019.1.

See the following post about the Dynamic SDF system as well as this additional video and preview of the fallback and dynamic system.

I have plenty of font assets, and I’ve used TMP for a while now. But the asset I generate with Korean characters does not work. I am copy+pasting this into my TMP Text with this font:

여보세요

That just turns into squares. However, if I type manually on my keyboard, the characters show up, but they stack on top of each other.

The actual .ttf works fine in normal UI Text; it is just the TMP asset that does not work properly in TMP Text.

Can you provide a link to the font file?

Most likely the font file is missing those characters in which case UI Text would use some other system font to replace those missing characters.

https://fonts.google.com/specimen/Sunflower

However, I do see that “characters packed” is “100/306”, how do I make sure all chars get packed?

Thank you for providing a reference to the source font file.

As per the image below, the source font file does appear to contain the characters “여보세요”.

However, in your original image, you have the Extended ASCII set selected and as such these Korean characters would be excluded as seen below.

The latest version of TMP (1.4.0 for Unity 2018.3, or 2.0.0 for Unity 2019.1) includes the new Dynamic SDF system. In that version, the Font Asset Creator provides more informative feedback, including the list of characters Included, Missing, and Excluded because they could not fit into the given atlas size based on the settings provided.

My recommendation would be to use the latest version of TMP with the new Dynamic SDF system if you are using Unity 2018.3: create a static font asset that contains Extended ASCII, then assign a single dynamic fallback to that primary static font asset. The Korean characters and glyphs will then be added to this dynamic fallback.

For example, the LiberationSans font asset included in the updated TMP Essential Resources with version 1.4.0 of TMP uses this setup, where LiberationSans is static but also has a dynamic fallback assigned to it.

Thanks,

Is there any way to do this that does not require 2018.3?
Upgrading from 2018.2 is not an option; it broke my project and I just spent an entire day restoring it.

EDIT: I watched this video
https://www.youtube.com/watch?v=qzJNIGCFFtY

According to this, if I set it to auto-size it should just generate an asset with all characters in the provided font, but it does not. I’m only getting the “standard” alphabetic characters and numbers, no Korean characters. But you said the font I linked does have those characters, so I’m confused. In your image it looks like you've generated an asset that contains the characters.

You need to include the Unicode ranges for those other languages. You would define those custom ranges using the Unicode Hex option, or you could use a text file that contains all the known text as well.
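
If it helps, here is a minimal Python sketch of one way to build such a hex list from the text you actually ship. The function name `to_hex_ranges` is made up for this example, and the comma-separated `start-end` output format is an assumption about what the Unicode Hex field accepts, so check it against your TMP version.

```python
def to_hex_ranges(text):
    """Collapse the distinct characters of `text` into comma-separated hex ranges."""
    # Sorted, de-duplicated code points; control characters such as newlines are skipped.
    codes = sorted({ord(ch) for ch in text if ord(ch) >= 0x20})
    ranges = []
    for code in codes:
        if ranges and code == ranges[-1][1] + 1:
            ranges[-1][1] = code          # extend the current run
        else:
            ranges.append([code, code])   # start a new run
    # Single code points are written alone, runs as "start-end" (assumed format).
    return ",".join(
        f"{lo:X}" if lo == hi else f"{lo:X}-{hi:X}" for lo, hi in ranges
    )


print(to_hex_ranges("Hello 여보세요"))
```

Feeding it every string in the game gives a list limited to the characters that are really used, which keeps the atlas small.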

P.S. Since the new Dynamic SDF system is very powerful and would make your life easier here, what part of your project broke when trying to upgrade to 2018.3?

Everything broke; I literally could not even open a scene. So I am not going through that again, since I’m almost done with my game. Do you know any good link to a place that has these types of text files or hex codes stored?

So what am I missing here, because it is not working:

What specifically broke?

Going from 2018.2 to 2018.3 should be pretty smooth. The only issue you would run into with version 1.4.0 of TMP is the need to switch the Scripting Runtime API to .Net 4.x, which is simple to do and will eventually be required as the old .Net 3.5 has been deprecated anyway. If that was the case, I already reverted that change for version 1.4.1-preview.1, which is available via the package manager.

In terms of a good place to get information about the Unicode ranges used for any given language, Unicode.Org is the best place.

For handling localization, the new Dynamic SDF system makes this process so much easier. If the issue with the upgrade was due to something else, then please submit a bug report with the project included, as simply upgrading from 2018.2 to 2018.3 should not blow up your project and is something we should look at. If submitting a bug report is too much trouble, I certainly understand; I am simply offering to take a look and invest the time myself to figure out why you ran into such issues. The new Dynamic system is very powerful, so I am also just trying to make your life easier in that respect.

I appreciate your help!

Nothing that has to do with TMP broke. Every prefab I had was “missing”, and I honestly do not have the time to deal with that for a project that is 95% done. TMP is really the only reason I wanted to upgrade anyway.

Looking at the screenshot from my last message, can you see if I'm missing something, or should that work? I'm perfectly happy skipping the dynamic part since I have a total of about 20 rows of text in my game. I'm spending so much time on something that is 0.1% of my game.

EDIT:

Solved by using another font, no idea how you made it work with that one.

I know I am late to this and surely you have solved the original issue, but I wanted to point out a link that may help with future projects moving from 2018.2 to 2018.3. The broken prefabs are likely caused by a compile error stopping a build. If you fix all compile errors and re-import the prefabs, they should work. If you have no compile errors, I would say delete your manifest.json before updating the scene so it can rebuild with the new settings as needed. And of course, always back up your scene before updating to a new version.

Back to the original question of importing CJK fonts: I too am confused and have questions about how this is done.

I am trying to add Japanese fonts, which is hard because there are several scripts to cover: Hiragana (3040 - 309f), Katakana (30a0 - 30ff), Kanji (Unicode code points regex: [\x3400-\x4DB5\x4E00-\x9FCB\xF900-\xFA6A]; Unicode block property regex: \p{Han}; or at least some of these), CJK unified ideographs (4e00 - 9faf; likely not all of these are needed, but I am unsure which ones I will need), and Japanese-style punctuation (3000 - 303f). Some text blocks would also need full-width roman characters (ff00 - ffef).
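
One way to narrow this down is to count how many distinct characters of the translated text fall into each of the ranges quoted above. The sketch below does that in Python. The block names and the `block_usage` helper are made up for illustration, and the ranges are the ones quoted in this post rather than authoritative Unicode block boundaries.

```python
from collections import Counter

# Inclusive ranges as quoted in this post; adjust them if your text needs more.
BLOCKS = {
    "CJK punctuation": (0x3000, 0x303F),
    "Hiragana": (0x3040, 0x309F),
    "Katakana": (0x30A0, 0x30FF),
    "CJK ideographs extension A": (0x3400, 0x4DB5),
    "CJK unified ideographs": (0x4E00, 0x9FCB),
    "CJK compatibility ideographs": (0xF900, 0xFA6A),
    "Full-width forms": (0xFF00, 0xFFEF),
}


def block_usage(text):
    """Count the distinct characters of `text` that fall into each listed range."""
    counts = Counter()
    for ch in set(text):
        code = ord(ch)
        for name, (lo, hi) in BLOCKS.items():
            if lo <= code <= hi:
                counts[name] += 1
                break
        else:
            # Anything outside the listed ranges (Latin letters, digits, etc.).
            counts["other"] += 1
    return dict(counts)


print(block_usage("前方フェアリング"))
```

Running it over all the translated strings shows which ranges are actually in use, so only those need to go into the character set.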

As you can see, that is a lot of Unicode to try and make a character set out of. Now, I could take all the translated text I have, put it into one block of text, remove any duplicates, and then create a font atlas based on that. But I have no idea how I would then implement this atlas. Do I paste all the characters into TMP Font Asset Creator > Custom Character List? If so, what do I put in the Select Font Asset attribute, as my only options are LiberationSans SDF and Fallback? What's more, how do I add more characters if I need them later? Once it is made, do I still paste the kanji into the TextMesh Pro text field, as I can't actually type it?

It seems all of these questions should be resolved already, as I am using Unity 2019.3.8 with TMP 2.0.1, so I should be able to use the Fallback LiberationSans SDF atlas to generate what's needed, as seen here. Trouble is, this is not working.

I put in " 前方フェアリング " and get back " _…☐ "

Any ideas or guidance would be appreciated.
Thank you.

This is likely due to your font asset not containing the requested glyphs. What font file are you using?

Please make sure you are using version 2.1.0-preview.8 of the TMP package for Unity 2019.3.

Make sure that you are using a font file that contains those Japanese characters.

Make sure you also create a dynamic font asset using the context menu “Create - TextMeshPro - Font Asset” or by selecting the font file and using “CTRL + SHIFT + F12”.

Depending on the stage of your project, at some point it will likely make sense to switch some font assets from dynamic to static, but while you are developing, just use dynamic font assets with multi atlas texture enabled as well. This is enabled in the Font Asset Inspector - Generation Settings. This is all new functionality that did not exist when the video was created.

Thank you very much. The “CTRL + SHIFT + F12” shortcut was very helpful in turning KozGoPr6N-Regular into an atlas material. The Font Asset Creator then makes the roman characters as mentioned earlier, and a duplicate font asset set to dynamic was then able to add the characters as shown in the video link. Thank you so much for your help!

Greatly appreciated.

My solution was I reverted back to regular unity text for all my UI, and I used the “Outline” component to add an outline to my text so it showed up against the background. The outline was the only real reason I was using Text Mesh Pro over Unity text, and the regular unity text is working with Chinese, Japanese, Korean, Thai and more by default. It may depend what TMP effects you need, but if you just need an outline then regular Unity text with the outline component works really well:)

Creating and using dynamic font assets, or a combination of static and dynamic font assets, should ensure everything works as expected, provided the font file you select does contain the CJK characters you are trying to display. You can also use the Multi Atlas Texture feature, which lets the dynamic font asset grow as needed to support displaying every single character in that font file, something the legacy text system cannot do. No one would ever need to display every single character, but if you wanted to, you could.

The Outline component increases the geometry count of the text by 5x and will impact performance.

P.S. I really need to update these videos to cover all the new features and the much simplified workflow, which is:

(1) Import the CJK font file in Unity.
(2) Select the font file and press CTRL-SHIFT-F12 and voila, you now have a new dynamic font asset.
(3) Optional - Enable multi atlas texture in the font asset inspector and you can now add every single character in that font file and get all the goodies from SDF like Outlines, Shadows, etc.

Thanks:) I could try this out.

You can switch languages dynamically in my game, so I need a font file that would handle all of them. With regular Unity text, no matter which font I choose, all the languages work by default. I’m guessing that for Chinese, Japanese and Korean characters it falls back to some internal font, because the CJK characters look different from the font I use in that text field.

Do you know of a good free Google font, or any font, that I could test TextMesh Pro with to see if it would work with a dynamic language switch? I’m in the process of converting all my text back to regular text, but if I can find a font that works with TMP I can try that out before I go any further.

I have another question.

How is regular Unity text able to display characters from every language?

Is it that every major platform (iOS, Android, Mac and PC) contains Asian characters in its system fonts, and Unity text switches to these when an Asian character is displayed? (So when Unity displays an Asian character not included in a font file, it’s actually not using the font itself in that case?)

Is this how Unity is able to display so many Asian characters without using a huge font file?

The Noto font family from Google does support every single language. There are many different versions of the font that can be chosen based on needs. Here is a link to this Noto font family. For CJK, I use the Noto CJK font which has good glyph coverage.

The legacy text system ends up using “some” system font available on the given platform. Most platforms include a set of fonts with coverage for CJK, but this varies greatly between platforms in terms of the fonts themselves, and this font selection can also change over time. Font selection can also vary between devices, mostly on Android.

The ability to use system fonts is nice to save on build size, but it also means your application ends up looking different on each platform, which for some developers / designers is a big “No No!”. In my opinion, unless you have build size limitations, you should be using your own fonts, which gives you ultimate control over the visual design, quality and QA.

In terms of build size, most developers get around this by using AssetBundles, Addressables, etc., where different payloads are downloaded based on language selection.

Note that when it comes to localization, most people initially think about text and fonts, but localization most often involves other resources like materials, images, animation, audio, colors, scenes, scripts, etc. Besides using different resources based on language selection, object settings and properties may also need to be modified, which is why many studios use some type of localization tool to track how language selection affects these resources and their settings, and ultimately to manage this process. Fonts are just a part of this overall localization process.

Thanks - I was thinking the system fonts worked something like this, and now I know:)

I'll give Noto a try with TextMesh Pro.

I'm a solo developer creating open worlds, so minimizing pipeline steps is very important. The only localization needed in my games is in the text, and I want to maintain a single executable, so as long as I can get either TMP to work with my csv file in all languages or use the default Unity text I should be OK. But it's good to know what AA indies and AAA developers set up when working with localization, so I have a better idea of what can be involved. Thank you.
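
For a CSV-driven setup like this, a small script can list the distinct characters each language column uses, which can then be saved to a text file for the Font Asset Creator or used to check a font's coverage. This is only a sketch in Python: the column layout (a `key` column followed by one column per language) and the `characters_per_language` helper are assumptions, not something taken from the project described above.

```python
import csv
import io


def characters_per_language(csv_text, key_column="key"):
    """Return {language column: sorted string of distinct characters used}."""
    reader = csv.DictReader(io.StringIO(csv_text))
    chars = {}
    for row in reader:
        for column, value in row.items():
            # Skip the lookup key, overflow fields (column None) and missing cells.
            if column is None or column == key_column or value is None:
                continue
            chars.setdefault(column, set()).update(value)
    return {lang: "".join(sorted(found)) for lang, found in chars.items()}


# Hypothetical two-language table, only for illustration.
sample = "key,en,ko\ngreeting,Hello,여보세요\n"
for lang, found in characters_per_language(sample).items():
    print(lang, found)
```

Each resulting string is the full set of characters that language needs, so it doubles as a quick list of what a static font asset would have to contain.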