Fools march in where angels fear to tread, and here go some comments by a
fool. :)

For a Bruce Sterling / Viridians-style Attention Conservation Notice, this
is mostly commentary; perhaps the one useful item is a suggestion to try
to contact educated native Tamil speakers to find out what they do.

On Sun, 30 May 2004 20:52:37 -0000, suzmccarth
<suzmccarth@...> wrote:

> I have been wondering how Korean was able to get both alphabetic
> units and syllables encoded in unicode but Tamil and Indic languages
> cannot.
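
(For anyone who wants to see the difference concretely, here is a minimal
Python sketch of my own, not anything official. It uses only the standard
unicodedata module; the sample syllables and the helper name code_points
are just my choices for illustration. It shows that a Korean syllable has
its own code point and also decomposes into alphabetic jamo, while a Tamil
syllable exists only as a sequence of consonant plus vowel sign.)

```python
import unicodedata


def code_points(text):
    """List each code point in text with its official Unicode name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


# One precomposed Hangul syllable (U+D55C HANGUL SYLLABLE HAN).
hangul = "\uD55C"
# Canonical decomposition (NFD) splits it into three conjoining jamo.
jamo = unicodedata.normalize("NFD", hangul)

# The Tamil syllable "ki": TAMIL LETTER KA followed by TAMIL VOWEL SIGN I.
tamil = "\u0B95\u0BBF"
# Canonical composition (NFC) has no single code point to fold it into.
tamil_nfc = unicodedata.normalize("NFC", tamil)

print(code_points(hangul))     # one code point
print(code_points(jamo))       # three code points
print(code_points(tamil_nfc))  # still two code points
```

(Recomposing the jamo with NFC gives back the single syllable, so both
Korean forms are canonically equivalent; Tamil has no such pair.)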

As a strictly unofficial Unicode hobbyist/enthusiast, in my
hopefully-humble opinion it seems that although "politics" (quotes
intentional, and not misused for emphasis) is minimal in what happens in
the creation and advancement of Unicode, in this instance I wonder, just a
little. From my badly-informed viewpoint, it seems that Unicode (and many
applications that use it) need[s] advocates for the Indic languages.

[As to politics, (and perhaps best-forgotten history?), iirc Tibetan was
deleted (and later re-included), but I might well misunderstand the
reasons; maybe it was *not* pressure from the P.R.C.?]

Before I make a bigger fool of myself, I should say that I think Unicode
has been exemplary in its international cooperation, although I do
understand that in parts of Asia, it's not exactly beloved (Tron comes to
mind, but I don't want to stir up a hornet's nest!).

Roughly around the time I was learning about Unicode and finding it quite
inspiring (the song on the 3.0 book's CD is lovely; I should listen
again!), I was also politely (I hope!) pestering the Opera browser people
to enable Opera's use with other scripts, trying to tell them that BiDi
exists (they must have known), and giving more than hints that rendering
other scripts was not a casual proposition (especially in a cross-platform
browser). Well, after some time, Opera seems to have it right for BiDi,
rendering Arabic and (most of the time) Hebrew well. It seems to do very
well with Chinese (both trad. and simplified), Japanese, and Korean, but...

support for Indic and SE Asian scripts seems still (far?) off in the
future. (Afaik, Thai line breaks don't have a dictionary lookup; not sure.)

In Opera's case, it's a commercial product, and Opera Software judges
where its markets are likely to be. (It does seem mildly surprising that
India is apparently not considered economically significant; maybe I get
it wrong...) Nevertheless, I'm very much an Opera advocate.

> As background, I have an ESL class and have my students keyboarding
> in Chinese, Korean and Tamil. The Chinese students use Pinyin Input
> and it works like dream for them - very satisfying. Korean also
> seems very simple to keyboard for a child.

I'm moderately curious what operating system you're using; Windows 2000?
Mac?

> However, Tamil has been a nightmare. [...]
> Then we tried the Unicode version in WinXP. Well, we get it and can
> use it but it is not esthetically satisfying or straightforward to use.

I'm no M$ fan, nor am I an XP fan.
Lack of advocacy?

> It seems from what I read that Tamil speakers have a strong feeling
> about syllable level representation, certainly as much as Koreans
> do, but they have a very unsatisfactory system. You should just
> watch the difference in the children keyboarding.

Ouch.

I also really wonder whether non-Eurocentric languages need a keyboard of
a quite-different design. Alternative keyboards tend to be designed for
alphabetic input of scripts with roughly the same number of characters as
found in European languages. While IMEs seem to work more or less well for
Asian languages, is there a need for a Pan-Asian keyboard?

Nevertheless, the present keyboard (or one of the alternative designs) has
a key count that seems ergonomically reasonable. I rather doubt that a
keyboard with several hundred character keys would be practical; I'm
reminded of the (slow!) Chinese typewriters that had maybe 3,000 or so
physically-unattached type slugs in a quite-big tray.

> I feel that Indic languages have been disadvantaged by western
> assumptions that they have an "alphabetic" system because of its
> appearance and no one really looks at how the system functions in
> their culture.

As to the Unicode/Tron comparison, although I'm essentially convinced that
Unicode made extreme, extensive, and commendable efforts to create a
long-term encoding for major Asian languages, it seems possible that more
local user and cultural "input" might have been wise. Might there not
also be a somewhat-related experience with the Indic languages/scripts?
However, please see the opening sentence of this message.

India has an impressive number of talented, educated, intelligent people
in the computer field. I'm rather surprised that (it seems that) more work
has not been done (or has it, but just not been publicized?) to make
keyboard entry of at least the principal Indic scripts practical and
culturally congenial.

I wonder whether you might learn about good Tamil input methods (if any
exist!) from educated Tamils in your area.

One factor could be emigration patterns; in my area, it seems that almost
everybody from India comes from Gujarat. (Oh my! I think I know an
educated native Tamil speaker whose store is within walking distance, and
who is in the computer field. No promises...). When I worked in a little
neighborhood computer store, out of maybe 15 Africans who came into the
store while I was there, 12 or so were from Uganda. (It was fun to ask
whether they spoke Luganda!) Nevertheless, I have little doubt that there
must be concentrations of Tamils in some parts of the USA.

While I tend to be conservative about kids and computers, especially
younger ones, nevertheless I'm delighted to learn that they are being
taught to type in their parents' languages. Doing so is so much more decent
than such crimes as punishing children for speaking their parents' native
languages.

While close to the topic of other-language keyboards, I surely hope that
at least some of them are not laid out to try to map the "qwerty" letter
layout! There's a lot of sociology involved with that layout, and the only
excuse for basing a layout for a different script on it is to make it
easier for a native English speaker/typist to learn the new arrangement. I
believe there's a Turkish letter layout that is radically different from
"qwerty", and iirc one of the Scandinavian languages also has a better
layout. (Yes, I use the Am. Std. Kbd., far better known as the Dvorak
layout, although Dvorak rearranged the numerals.)

As to letter layouts, there was also one developed by C.L. Sholes, one of
the principal inventors of typewriters, another ("dhiatensor") for the
Blickensderfer typewriter, the Linotype layout, alphanumeric sequences
(horrible!), and probably others for historic typewriters.

I remember talking with a native Japanese speaker who was in my city,
working as a contract translator into Japanese for software docs.
He didn't have a choice of input methods, and hated the one he had to use;
iirc, it was Shift-JIS.

Thanks, folks, and best of luck to "Suz" and her Tamil-speaking youngsters!

--
Nicholas Bodley /*|*\ Waltham, Mass.
Opera 7.5 (3778), using M2