Peter T. Daniels scripsit:

> A guess without looking at other answers yet: treating all the Sinitic
> character sets separately?

Unicode does not treat them separately. However, it has encoded too many
glyph variants separately. Luckily, one can ignore most of the names of
long-dead horses and the spelling errors that got entered in dictionaries
anyway.

--
"Kill Gorgûn!  Kill orc-folk!           John Cowan
No other words please Wild Men.         jcowan@...
Drive away bad air and darkness         http://www.reutershealth.com
with bright iron!"  --Ghân-buri-Ghân    http://www.ccil.org/~cowan