>>>>> Marco Cimarosti <Marco.Cimarosti@...> writes:

> Pierpaolo Bernardi wrote:
>> A secondary benefit of having a canonical decomposition for hanzi is
>> that this decomposition could be used as a base for a radical+strokes
>> based ordering.

> This is possible only if the decomposition and the radical-stroke method
> share the same logic, which is probably quite unlikely.

Yes. A canonical decomposition of hanzi into hemigrams would treat each hanzi
as an abstract character, at a level above the analysis of any specific
written instance of it. So it would be independent of stroke counts. The idea
is that as long as the two components are accounted for, the character could
be represented in any number of ways, even in Latin script.

If, for example, the phonetic ZEQ (ceng2, U+66FE) were included as one of the
2000 or so components, 49 digrams (zi4) could be constructed by combining it
with a Kangxi classifier. (These classifiers are usually called 'radicals',
even though in the vast majority of cases they are not the root of the
character, and even though 'radical' does not actually translate the Chinese
and Japanese term for them, which is bu4shou3 (J. bushu).)

Each classifier may be translated in many ways, of course. To emphasize their
usual 'modificative' function within the Chinese character, I have used
Boodberg's adjectives rather than nouns here. In practice these could be
abbreviated: aquatic -> a, human -> h, etc. Thirteen of the 49 possible
characters are listed below. In addition to Boodberg's adjective, each Kangxi
classifier is also identified by its number.

zeqaquatic 85

zeqbombycinous 120

zeqcheirological 64 (a classic mistake for zeqterrestrial, #32)

zeqdendrological 75

zeqflammeous 86

zeqhuman 9

zeqintimate 61

zeqjaspidoid 96

zeqlapidary 112

zeqmetallic 167

zeqoral 30

zeqquasi-cannaoeous 118

zeqsarcological 130
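
The naming scheme in the list above can be sketched in a few lines of Python.
This is only an illustration, not a proposed implementation: the names
CLASSIFIERS, compose and decompose are hypothetical, and the table holds just
the thirteen adjective/number pairs listed above, spelled as they appear there.

```python
# Boodberg-style adjective -> Kangxi classifier number, copied from the list
# above. Only the thirteen listed classifiers are included (assumption: a full
# table would cover all 214).
CLASSIFIERS = {
    "aquatic": 85,
    "bombycinous": 120,
    "cheirological": 64,
    "dendrological": 75,
    "flammeous": 86,
    "human": 9,
    "intimate": 61,
    "jaspidoid": 96,
    "lapidary": 112,
    "metallic": 167,
    "oral": 30,
    "quasi-cannaoeous": 118,
    "sarcological": 130,
}


def compose(phonetic, adjective):
    """Join a phonetic name and a classifier adjective into one digram name."""
    if adjective not in CLASSIFIERS:
        raise KeyError(f"unknown classifier adjective: {adjective}")
    return phonetic + adjective


def decompose(name, phonetic="zeq"):
    """Split a digram name into (phonetic, adjective, Kangxi number)."""
    if not name.startswith(phonetic):
        raise ValueError(f"{name!r} does not start with {phonetic!r}")
    adjective = name[len(phonetic):]
    return phonetic, adjective, CLASSIFIERS[adjective]


if __name__ == "__main__":
    print(compose("zeq", "aquatic"))
    print(decompose("zeqmetallic"))
```

Note that nothing here depends on stroke counts or on where the classifier
sits in the written character; the name records only which two components are
present.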

One question that we have been discussing is whether the order or position of
these two components matters. Can we follow the principles of composition
illustrated above to identify all 50,000+ Chinese characters? Or is it
necessary to introduce a positional operator such that, for example, the
second-to-last entry in the list above would be written quasi-cannaoeous/zeq,
indicating that classifier 118, quasi-cannaoeous-cannoid/bamboo, should appear
above the phonetic zeq? So far on this thread we have identified a few cases
where the position of the hemigrams yields a different character. But it is
not yet clear that these cases require positioning indicators. Perhaps they
can be handled by an indicator for full form (automorphic) versus shortened
form (brachymorphic). Or perhaps positioning indicators should be added
alongside such an indicator. In any event, introducing positioning indicators
(such as /) is not too difficult. And the results could still be read aloud,
which is one of the requirements for such a code, since it would give access
to Chinese characters to those who cannot see them but can hear them.
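
To make the question concrete, here is a rough Python sketch of how a reader
of such names might treat the / operator. It assumes, as in the example above,
that "A/B" means A is written above B, and that a name without / says nothing
about position. The Digram type and the parse function are hypothetical names
used only for illustration.

```python
from typing import NamedTuple, Optional


class Digram(NamedTuple):
    components: frozenset   # the two hemigrams, with no order implied
    upper: Optional[str]    # component written above the other, if stated


def parse(name, phonetic="zeq"):
    """Read an unpositioned name (zeqaquatic) or a positioned one (A/B)."""
    if "/" in name:
        # Positioned form: the part before the slash sits above the rest.
        upper, lower = name.split("/", 1)
        return Digram(frozenset((upper, lower)), upper)
    if name.startswith(phonetic):
        # Unpositioned form: phonetic followed directly by the adjective.
        adjective = name[len(phonetic):]
        return Digram(frozenset((phonetic, adjective)), None)
    raise ValueError(f"cannot parse {name!r}")


if __name__ == "__main__":
    plain = parse("zeqquasi-cannaoeous")
    placed = parse("quasi-cannaoeous/zeq")
    # Same two components either way; only the second states a position.
    print(plain.components == placed.components)
    print(plain.upper, placed.upper)
```

Under these assumptions the positioned and unpositioned names pick out the
same pair of components, so the operator would only be needed for the few
cases where position alone distinguishes two characters.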

Jon

--
Jon Babcock <jon@...>