On Sat, 28 Oct 2000, Jim Breen wrote:

> [Jon Babcock (Re: CJK combining components (was RE: "Giga ...)) writes:]
> >> >>>>> Jim Breen <jwb@...> writes:
> >> > As I mentioned before, my main interest in the decomposition technique
> >> > is as an aid for dictionary searching. If someone were to come up with a
> >> > useful system, I'd like to be able to use it to extend my coverage
> >> > beyond the 6,355 kanji in JIS X 0208.
> >>
> >> It would seem that any new system would not help in using the existing
> >> dictionaries. Or, do you mean to extend the coverage of the dictionaries you
> >> are making and provide an alternate system of organizing the kanji?
>
> Well, au contraire, it can lead directly into several dictionaries. For example, my
> dictionary files of the "JIS" kanji (6,355 + 5,801) have the indices for about half
> a dozen dictionaries and instructional books, including the Daikanwajiten. Thus with
> appropriate software you can use the fragments to identify an obscure kanji, and
> have immediate pointers to it in the dictionaries.

I suppose "instructional books" says it all, but is such a system intended
for less advanced users? I occasionally use the "multi-radical" utility
in NJStar Communicator while typing, which seems similar to your system,
but I despair when I have to specify a character by chopping it up into
too many components, without being able to specify basic positioning
information. As most characters are of the signific-phonetic variety
(97% is one figure I've seen before, I believe for the time of the Han
dynasty, U+6F22), it seems more intuitive (at least from a
Chinese language background) to describe a character like qun2 'dress'
(U+88D9) as signific yi1 'clothing' (U+8863) and phonetic jun1
'lord' (U+541B), rather than chop the right half into 'mouth' and
other pieces.
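
To make the idea concrete, here is a minimal sketch in Python of a lookup
keyed on signific, phonetic, and the position of the signific. Everything
in it is an assumption for illustration: the names Entry, TABLE, and lookup
are hypothetical, the three entries are hand-entered, and it does not
describe how NJStar's utility or Jim's dictionary files actually work.

```python
from typing import List, NamedTuple, Optional


class Entry(NamedTuple):
    char: str          # the character being described
    signific: str      # meaning-bearing component
    phonetic: str      # sound-bearing component
    signific_pos: str  # where the signific sits: "left", "right", ...


# Toy table, hand-entered for illustration only (not from any real data file).
TABLE = [
    Entry("\u88d9", "\u8863", "\u541b", "left"),   # qun2: 'clothing' + 'lord'
    Entry("\u7fa4", "\u7f8a", "\u541b", "right"),  # qun2: 'sheep' + 'lord'
    Entry("\u90e1", "\u9091", "\u541b", "right"),  # jun4: 'city' + 'lord'
]


def lookup(signific: Optional[str] = None,
           phonetic: Optional[str] = None,
           signific_pos: Optional[str] = None) -> List[str]:
    """Return every character that matches all constraints actually given."""
    hits = []
    for entry in TABLE:
        if signific is not None and entry.signific != signific:
            continue
        if phonetic is not None and entry.phonetic != phonetic:
            continue
        if signific_pos is not None and entry.signific_pos != signific_pos:
            continue
        hits.append(entry.char)
    return hits


if __name__ == "__main__":
    # Phonetic alone leaves three candidates; adding the signific leaves one.
    print(len(lookup(phonetic="\u541b")))
    print(lookup(signific="\u8863", phonetic="\u541b") == ["\u88d9"])
```

The point of the sketch is only that two whole components plus a position
narrow the candidates quickly, whereas 'mouth' by itself would match a very
large number of characters.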

Also, I'd like to ask how effective a decomposition system has been for
the ~6300 kanji of JIS X 0208--does it break down and yield too many
matches when the pool of characters is too large, like the SKIP system?
If so, what is this threshold? (I wonder how the 4 Corner system holds
up, as it is used for the near-50,000 characters in the _Dai Kanwa
Jiten_.)


Thomas Chan
tc31@...