From: Thomas Chan
Message: 117
Date: 2000-11-01
> Thomas Chan wrote:
> > I should first say that I'm not a regular user of post-1950's characters
> > used in the PRC, so my perception of characters and their constituents
> > is based on an understanding of the "traditional" forms.
>
> Are you talking about the so-called "simplified" characters? These are the
> hanzi that I was taught when I studied Chinese, so I am more familiar
> with them than with the traditional ones.
> My impression is that exactly the same combining rules are used in both
> orthographies. Of course, some components are totally different, and many
> hanzi use a smaller number of components, but the way they combine doesn't
> change.

Yes, I'm talking about those, but I'd like to make a distinction between:
> > I suppose it'd depend on what one considers to be a component in more
> > recently created characters, such as some post-1950's PRC creations, as
> > we don't have epigraphy and other sources to consult on the etymology
> > of the pieces. e.g., how many components are in dong1 'east' (U+4E1C)
> > or che1 'cart' (U+8F66)? Is the right half of han2 'Korea' (U+97E9) one
> > or many components? What about the right halves of zhuan3 'to turn'
> > (U+8F6C) or chuan2 'to transmit' (U+4F20)? If they are made of more
> > than one component, then I'd think they are "glued" together in new
> > ways of assembly--perhaps some kind of overlapping or merger of certain
> > strokes?
>
> U+8F66 (车), strictly corresponding to traditional U+8ECA (車), is definitely
> an atomic component (it's also a radical in all the dictionaries I have
> seen). BTW, it is also one of the best examples of how "simplification" was
> done: U+8F66 is clearly derived from a strongly cursive version of U+8ECA.
> The other graphemes that you mention should probably be considered as atomic
> components as well. Nevertheless, further analysis of shapes like the right
> part of U+4F20 (传) could be possible, in a system that allows for overlaid
> components.
> But I don't understand why you see special combining rules here. Similar odd
> compositions also exist in traditional hanzi. See, for instance, some
> compounds of radical U+5F13 (弓): U+5F14, U+5F17, U+5F1F (弔, 弗, 弟).

Are there any cases where multiple components in a "traditional" form have
> > On this note, perhaps one might want to try to fit the more radical
> > (and abortive) "Second Scheme" of PRC simplifications in the late 70's
> > and early 80's into one's system for fanatic completeness.
>
> Interesting; what is that? Do you have any on-line samples? Or can you scan
> printed matter?

I don't have any primary source materials, but there is a small section in
> > Is there one for trios? A "macro" of sorts for a pyramid structure--a
> > dozen or two occur in the AD 100 _Shuowen Jiezi_ (U+8AAA U+6587 U+89E3
> > U+5B57)--would also be handy for some composition schemes (rather than
> > a combination of a "top-to-bottom" and a "left-to-right" operator).
> > Ditto for quads.
> >
> >    x      y y
> >   x x     y y
>
> Are you asking about a specific system? Which one?
> Within the Unicode "Ideographic Description Characters", the only
> 3-component IDCs are for side-by-side juxtaposition and for vertical
> stacking.

No system in particular, although the IDS of Unicode (and GBK) are the
> The "pyramid" structure would indeed be a sequence like <TTB x1 LTR x2 x3>
> in Unicode IDS (Ideographic Description Sequences).
> The quad structure is interesting, because it can be represented by two
> competing sequences: <TTB LTR y1 y2 LTR y3 y4> vs. <LTR TTB y1 y3 TTB y2
> y4>.
> BTW, the last time I discussed the issue of "Han decomposition" on the
> Unicode List, this fact of quads (and many other structures) having several
> possible analyses was mentioned as one of the big dangers of the whole idea:
> imagine searching for your <TTB LTR y1 y2 LTR y3 y4> in a text file, and not
> finding it because it was written as <LTR TTB y1 y3 TTB y2 y4>...
> Adding a specific "QUAD" operator could sound like a solution to this
> problem, but it isn't -- it just adds one more occasion for
> "misspellings"...

If I had to, I would analyze the quad as two rows that had two elements in
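
For what it's worth, the search problem with the two quad spellings can be illustrated with a rough sketch. It uses the TTB/LTR notation from the quoted message rather than real IDC code points (Unicode spells them U+2FF1 and U+2FF0), and the names layout() and same_shape() are made up for this example. It also assumes every operator splits its area into two equal halves, which real glyphs need not do, so it only reconciles grid-like cases such as the quad:

```python
from fractions import Fraction

# Binary operators in the message's notation (hypothetical token names;
# Unicode encodes them as U+2FF1 for TTB and U+2FF0 for LTR).
OPERATORS = {"TTB", "LTR"}


def layout(sequence):
    """Return the set of (component, box) pairs described by a sequence.

    A box is (left, top, right, bottom) inside the unit square. Splitting
    every operator into two equal halves is an assumption of this sketch.
    """
    tokens = sequence.split()
    cells = set()

    def place(pos, left, top, right, bottom):
        # Consume one sub-sequence starting at tokens[pos]; return next index.
        if pos >= len(tokens):
            raise ValueError("sequence ended early")
        token = tokens[pos]
        if token not in OPERATORS:
            cells.add((token, (left, top, right, bottom)))
            return pos + 1
        if token == "TTB":
            mid = (top + bottom) / 2
            pos = place(pos + 1, left, top, right, mid)
            return place(pos, left, mid, right, bottom)
        mid = (left + right) / 2  # token == "LTR"
        pos = place(pos + 1, left, top, mid, bottom)
        return place(pos, mid, top, right, bottom)

    zero, one = Fraction(0), Fraction(1)
    end = place(0, zero, zero, one, one)
    if end != len(tokens):
        raise ValueError("trailing tokens after a complete sequence")
    return frozenset(cells)


def same_shape(a, b):
    """True if two sequences put the same components in the same boxes."""
    return layout(a) == layout(b)


row_major = "TTB LTR y1 y2 LTR y3 y4"
col_major = "LTR TTB y1 y3 TTB y2 y4"
print(row_major == col_major)            # a plain string search sees two spellings
print(same_shape(row_major, col_major))  # the layouts coincide
```

A plain text search compares the spellings, so it misses one form when given the other; comparing layouts is one possible way around that, at the cost of parsing every sequence first.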
> About the "pyramid" structure: have you noticed that, in a lot of cases,
> this structure is used with the *same* component repeated three times? I
> always wondered why this is so common; does anyone have an "etymological"
> explanation for this?

At first glance it seems to imply a progression of "multitude" of some