Table of General Standard Chinese Characters

Back in 2013, the Chinese government published the Table of General Standard Chinese Characters, a list of essentially all the characters that you might expect to encounter in electronic communication.

This list comprises 8105 characters divided into 3 groups. The first group contains the 3500 most frequently used characters. The second group contains 3000 characters that are still common but much less frequently used, and the last group contains the remaining 1605, which are considered rare.

Attached to this post are lists of these characters as hex values, which you can copy and paste into the Font Asset Creator using the Unicode Range (Hex) option as seen below.

3711295–306721–Chinese Characters Common 1.txt (17.1 KB)
3711295–306724–Chinese Characters Common 2.txt (14.7 KB)
3711295–306727–Chinese Characters Uncommon.txt (8.02 KB)
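
If you need a hex list for a different set of characters, it can be generated with a small script. The following is a minimal Python sketch, not part of TMP; `to_hex_ranges` is a hypothetical helper name, and it assumes the Unicode Range (Hex) field accepts comma-separated hex code points with optional `start-end` ranges.

```python
def to_hex_ranges(chars):
    """Return comma-separated hex code points, merging consecutive runs into ranges."""
    points = sorted({ord(c) for c in chars})  # unique code points, ascending
    parts = []
    i = 0
    while i < len(points):
        start = end = points[i]
        # Extend the current run while the next code point is adjacent.
        while i + 1 < len(points) and points[i + 1] == end + 1:
            i += 1
            end = points[i]
        parts.append(f"{start:X}" if start == end else f"{start:X}-{end:X}")
        i += 1
    return ",".join(parts)


# Four CJK characters given by escape so the code points are explicit.
print(to_hex_ranges("\u4e00\u4e01\u4e03\u4e09"))  # 4E00-4E01,4E03,4E09
```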

Hey, really useful thanks, but how did you paste a 3500 hex code string into the TMP window?

You paste it in the Character Sequence portion of the Font Asset Creator as shown above.

Yeah it says the string is too long

That is an editor error which will go away in Unity 2018.3 (I believe). This error is due to the 65,535 vertices limitation which also affects the editor. The good news is it doesn’t prevent the Font Asset Creator from doing its job.

@Stephan_B Is it possible to get a breakdown of Korean and Japanese characters like the one you have for Chinese above?

I would assume there are similar lists available for Japanese but I am not personally aware of any. This is where it would be nice to get some insight from someone fluent in the language.

With regard to Korean / Hangul, the script comprises 11,172 syllable characters.
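
As a quick check on that figure, here is a short Python sketch of the arithmetic. It relies on the Unicode Hangul Syllables block, which starts at U+AC00 and combines 19 initial consonants, 21 vowels, and 28 final positions (including "no final"); the variable names are only illustrative.

```python
HANGUL_BASE = 0xAC00                    # first precomposed Hangul syllable
INITIALS, VOWELS, FINALS = 19, 21, 28   # FINALS includes the "no final consonant" case

count = INITIALS * VOWELS * FINALS
last = HANGUL_BASE + count - 1
hangul_range = f"{HANGUL_BASE:X}-{last:X}"

print(count, hangul_range)  # 11172 AC00-D7A3
```

That single range could be used as a hex range for the full Hangul syllable set, assuming the source font contains all of those glyphs.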

Having said all of that, once the next release of TMP is available with Dynamic SDF support, managing languages like CJK which have large character sets will get much easier.

Is there an ETA on the next release of TMP? :–)

I’m currently working on a project that needs support for all Unicode characters, so I’m really looking forward to the Dynamic SDF support!

This Unicode chart link lists the Unicode ranges. We can include ranges based on which characters we want in our font asset. Thanks for the reply, @Stephan_B.

Here is the official Unicode chart.

e… I’m Chinese.
I write the characters into a file and select the “Characters from File” option.
I think it’s easier to read and use.

you can find character files here

(it’s not my repo, i just find it)
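
For anyone who would rather build such a character file from their own game text, here is a minimal Python sketch. The helper names and the `characters.txt` path are hypothetical, and it assumes the “Characters from File” option reads a plain UTF-8 text file.

```python
from pathlib import Path


def unique_characters(texts):
    """Collect each printable character once, in first-seen order."""
    seen = {}  # dict keeps insertion order
    for text in texts:
        for ch in text:
            if ch.isprintable():  # drops newlines and other control characters
                seen.setdefault(ch, None)
    return "".join(seen)


def write_character_file(path, texts):
    """Write the unique characters to a UTF-8 text file and return them."""
    chars = unique_characters(texts)
    Path(path).write_text(chars, encoding="utf-8")
    return chars


print(unique_characters(["你好", "好的 ok\n"]))  # 你好的 ok
# Example call (hypothetical path): write_character_file("characters.txt", all_strings)
```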

P.S. I think 2048*2048 for 3500 characters is not enough; the characters look a little blurred.
But 4096*4096 makes the file too large, about 17 MB, which is a problem on mobile phones.
(I used 4096*4096 for 3500 characters before, and now I use 4096*4096 for more than 7000 characters.)
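
The size reported above is consistent with a rough estimate. This Python sketch assumes the atlas is stored uncompressed with one byte per texel (a single 8-bit channel); that storage format is an assumption here, not something stated in this thread.

```python
def atlas_bytes(width, height, bytes_per_texel=1):
    """Uncompressed atlas size in bytes (assumes a single 8-bit channel by default)."""
    return width * height * bytes_per_texel


for size in (2048, 4096):
    mib = atlas_bytes(size, size) / (1024 * 1024)
    print(f"{size} x {size}: {mib:.0f} MiB")
# 2048 x 2048: 4 MiB
# 4096 x 4096: 16 MiB
```

Under that assumption a 4096 x 4096 atlas is 16,777,216 bytes, which matches the "about 17 MB" figure, and it is four times the size of a 2048 x 2048 atlas.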

P.P.S. I still have a question: TMP seems to use different shaders on PC and mobile,
but in some places (titles, for example) I need the characters to look the same on both.
What should I do? Avoid advanced effects like ‘lighting’ and ‘glow’?

Split the characters between a Primary font asset that is 2048 x 2048 and one or more fallback font assets. This way, the Primary can contain the more frequently used characters, sampled at a higher sampling point size, while the less frequently used characters in the fallbacks use a lower sampling point size.

When using fallback font assets, the ratio of Sampling Point Size to Padding must be the same but texture size can be different. So for instance, if the Primary has a sampling point size of 80 with padding of 8, then any fallback could have a sampling point size of 60 with padding of 6 or point size of 50 with padding of 5.
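
The ratio rule above can be checked with a few lines of arithmetic. This is a Python sketch only; `same_ratio` and `padding_for` are hypothetical helper names, not TMP functions.

```python
from fractions import Fraction


def same_ratio(primary, fallback):
    """True if two (point_size, padding) pairs share the same size-to-padding ratio."""
    (p_size, p_pad), (f_size, f_pad) = primary, fallback
    return Fraction(p_size, p_pad) == Fraction(f_size, f_pad)


def padding_for(primary, fallback_point_size):
    """Padding a fallback needs at the given point size to match the primary's ratio.

    A non-integer result means that point size has no exact whole-number match.
    """
    p_size, p_pad = primary
    return Fraction(fallback_point_size * p_pad, p_size)


primary = (80, 8)                     # sampling point size 80, padding 8
print(same_ratio(primary, (60, 6)))   # True
print(same_ratio(primary, (60, 5)))   # False
print(padding_for(primary, 50))       # 5
```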

Having said that, please see the following post / video about the soon-to-be-released Dynamic SDF system, which will make this a lot simpler.

The same shaders are used on all platforms. The mobile distance field shaders provide better performance mostly because of their reduced feature set.

Shader selection never changes based on platform, so if you create a Material Preset that uses the mobile distance field shader because you only need outline + shadow for that style of text, then that same material and shader will be used on all platforms. The text will render exactly the same on all of them. If you create another Material Preset where you want to use bevel and similar features, then that shader will also be used on all platforms for any text using that Material Preset.

The choice of shader is based on the features / visual design you want for the text using the given Material Preset and shader.

Hi, Stephan. I’m trying to generate the Chinese character atlas with the Font Asset Creator. I followed the steps above, but the generation process often ends with a Fatal Error in GC that freezes the editor, and I have to kill the editor process with Windows Task Manager. By the way, I’m using Microsoft YaHei as the Source Font File (the font file is too large to upload).

Do you have any idea why this happens?

Any response is appreciated. Thank you.

Environment:
Unity 2018.3.6f1 (64bit)
Text Mesh Pro 1.3.0

Time to update to the latest release of TMP with Dynamic SDF support. This is version 1.4.0-preview.2a for Unity 2018.3.

Please see the following thread / post and be sure to watch the video to know what to expect.

It works for me. Thank you very much.

I am using the latest version. I can’t manage to add 3 languages: Korean, Chinese and Arabic. I tried it as written here and generated the font asset, but the Chinese characters are not added. What is the problem?

Does your font file contain those Chinese characters?

Can you provide me with the font file and your text file that contains the list of characters you are trying to add? Once I get those I can test on my end to make sure it behaves as expected.

Good day. I have attached them.

4392733–399337–ff.txt (32 KB)
4392733–399340–LiberationSans 6SDF.rar (350 KB)

LiberationSans.ttf does not contain any Chinese, Japanese, Korean or Arabic characters, which is why these characters do not show up.

You will have to select some other font file that contains characters for the specific languages you want to support. Google Fonts is a good source for such font files.
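
To check a font for coverage before generating an atlas, something like the following Python sketch may help. The helper names are hypothetical; `load_supported_codepoints` assumes the third-party fontTools package is installed and is not part of TMP or Unity.

```python
def missing_characters(text, supported_codepoints):
    """Return the non-space characters in text that the font does not cover, in order."""
    missing = []
    for ch in text:
        if ch.isspace() or ch in missing:
            continue
        if ord(ch) not in supported_codepoints:
            missing.append(ch)
    return missing


def load_supported_codepoints(font_path):
    """Read the code points a font maps to glyphs (requires fontTools)."""
    from fontTools.ttLib import TTFont  # third-party: pip install fonttools
    cmap = TTFont(font_path).getBestCmap() or {}  # None if no Unicode cmap
    return set(cmap)


# Stand-in for a Latin-only font: printable ASCII only.
latin_only = set(range(0x20, 0x7F))
print(missing_characters("Hi 你好你", latin_only))  # ['你', '好']
```

With a real font file, the set would come from `load_supported_codepoints("path/to/font.ttf")`, where the path is whatever font you plan to use as the Source Font File.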

Is it safe to use a main font with Latin characters and fallback fonts for Chinese, Korean, Arabic, Japanese, Russian, etc.? Will this have any negative impact on performance? Thank you!