Due to bad decisions made by a Unicode committee decades ago, you need separate fonts for Japanese, Korean, Simplified Chinese, and Traditional Chinese because of code point clashes: regional variants of a character were unified onto a single code point (Han unification), so the font alone decides which form you see. These characters may mean roughly the same thing and look roughly the same, but that does not make one region's form correct in another.
(Each TC/SC/J/K font is 5–7 MB. There are six standard calligraphic systems, never mind weights and handwriting fonts and all that fun graphic design stuff.)
So even though we now have emoji that combine in various ways to form new emoji, we are somehow stuck with downloading separate TC/SC/J/K fonts and then having the wrong character display on a webpage anyway, due to how font fallback works.¹
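To make the clash concrete, here's a quick Python sketch (my own illustration; `describe` is a hypothetical helper, and 骨 is just one commonly cited example). The character is drawn differently in Simplified Chinese fonts than in Traditional Chinese or Japanese ones, but plain text only ever stores the one code point, so nothing in the string tells a renderer which shape you wanted.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# Same code point no matter which language the surrounding text is in;
# the regional shape comes entirely from whichever font gets picked.
print(describe("骨"))  # U+9AA8 CJK UNIFIED IDEOGRAPH-9AA8
```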
In the word processing days, typesetters could create new characters by defining a structure and the component parts that go into it. This is very sane, because this is how characters are built, and such a system would handle almost everything except bíangbíang miàn. A lot of characters are generated this way by font foundries, since it’s just impossible to make fonts otherwise. A construction system starts to break down once you factor in calligraphic balancing, but ligature support exists for Roman fonts….
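For what it's worth, Unicode already has a notation along these lines: Ideographic Description Sequences, which use the operator characters starting at U+2FF0 to spell out a character's layout and components. They only describe a character, though; nothing obliges a renderer to actually draw one from the description. A minimal Python sketch of the idea, where `LAYOUTS` and `compose` are my own hypothetical names and only two of the layout operators are covered:

```python
# Two of the Ideographic Description Characters (the block starts at U+2FF0).
LAYOUTS = {
    "left-right": "\u2ff0",  # ⿰
    "top-bottom": "\u2ff1",  # ⿱
}


def compose(layout: str, first: str, second: str) -> str:
    """Build a description sequence: layout operator, then its two parts.

    A part can be a single component or another description sequence,
    which is how nested structures are expressed.
    """
    return LAYOUTS[layout] + first + second


# 好 is 女 beside 子.
print(compose("left-right", "女", "子"))  # ⿰女子

# 花 is 艹 over 化, and 化 is itself 亻 beside 匕.
print(compose("top-bottom", "艹", compose("left-right", "亻", "匕")))  # ⿱艹⿰亻匕
```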
The reason I bring this up is that Chinese speakers are having a hell of a time² trying to add new characters to Unicode, because there’s a high barrier of attestation required that doesn’t factor in, for example, concepts that didn’t exist a century ago. (Everyone’s currently making do with “TA” in Roman letters as a gender-neutral pronoun.) Even well-attested 漢字 in non-Mandarin languages first need to be approved, then need to be added to fonts. And it’s just impossible to add even the majority of extant characters.³
But look at all the shiny new emoji being added!
¹ For some reason, web browsers will attempt to faux-italicize CJK characters (and also emoji). These writing systems were not designed for italics! I tried using the CSS `font-synthesis` property to turn that off, but it doesn’t work if you’re using a generic font fallback like `serif` rather than enumerating every single CJK font under the sun.
² I can’t find the citation for this right now; it was a blog post about a proposal to do with gender-neutral characters within the past few years iirc.
³ I actually ran into a character unsupported by Unicode with a recent pop song I was translating!