On Sat, 19 Mar 2005 19:18:55 -0500, Richard Wordingham
<richard.wordingham@...> wrote:

> Many of the arguments presented are weak.

An informed statement is welcome; someone like me can be swayed by an
effective presentation...

> While I can't comment on the ISCII,

Unless I'm seriously mistaken, each script included in ISCII has to fit
into fewer than 256 code points. While the creators of ISCII tried to
assign code points for given characters consistently across the different
scripts, making at least some transliteration possible without conversion
tables, apparently ISCII doesn't offer that consistency for all the major
Indic scripts; or so I gathered from the site.

> ... they are clearly not aware of how half-forms and mandatory
> virama are represented (- by ZWJ and ZWNJ).

It's interesting to read your comments saying that they apparently don't
have a thorough knowledge of Unicode. Being a "Unicode fan", I'm glad to
see that the difficulties they describe are either nonexistent or of far
lesser importance. I did come away from the site thinking that the authors
were doing their best to advance the idea of an akshara-based encoding.
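
Out of curiosity, I tried to write down the sequences you mention, as I
understand them from the Unicode book; the Devanagari KA + SSA pair here is
just my own example, so please correct me if I have the ZWJ/ZWNJ roles
backwards:

    KA, VIRAMA, SSA = '\u0915', '\u094D', '\u0937'
    ZWJ, ZWNJ       = '\u200D', '\u200C'

    conjunct  = KA + VIRAMA + SSA         # normally displays as the KSSA ligature
    half_form = KA + VIRAMA + ZWJ + SSA   # ZWJ requests the half-form of KA
    explicit  = KA + VIRAMA + ZWNJ + SSA  # ZWNJ keeps the visible virama (halant)

    for label, s in [('conjunct', conjunct),
                     ('half-form', half_form),
                     ('explicit virama', explicit)]:
        print(label, ' '.join('U+%04X' % ord(c) for c in s))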

One thing I like about Unicode is that (as I see it!) many very fine minds
have done their best to define its structure and details; when I first
learned about it, it was quite inspiring. (The song on the Unicode 3
book's CD is lovely!)

> The discussion on Tamil is almost totally invalidated because of a
> belief that /hoo/ is encoded <<ee>><<h>><<aa>>.

Utterly irrelevant, but seeing that little collection made me think,
irresistibly, of Western U.S. cowboys hollering "hooo-ha!" and "yeeee-ha!"

[ I see doubled "angle brackets" surrounding transliterated letters and
such. I'm wondering why they are doubled; there must be a good reason.]
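
More seriously, here is what I take the actual point to be, spelled out in
code points (a small Python check, purely for my own clarity):

    # Unicode stores the Tamil syllable /hoo/ in logical order, HA
    # followed by the OO vowel sign; the reordering into <ee> <h> <aa>
    # is something the *display* does, not the encoding.

    import unicodedata

    hoo = '\u0BB9\u0BCB'   # TAMIL LETTER HA + TAMIL VOWEL SIGN OO
    print([unicodedata.name(c) for c in hoo])
    print(['U+%04X' % ord(c) for c in unicodedata.normalize('NFD', hoo)])
    # NFD splits the OO sign into its EE and AA parts (U+0BC7, U+0BBE),
    # but HA still comes first in the stored text.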

> In particular, the primary Unicode representations have been designed so
> that context can be *ignored* when searching.

Good to know!

> Is their encoding published? It could get very unwieldy if they have to
> add new combinations.

That's one thing I don't remember from reading; it's rather late here, or
else I'd go have a look. (I also have a huge backlog of e-mail.)

They did mention plain 16-bit code points, which leaves a *lot* of space!
16 bits gives you 65,536 unique numbers. I would expect them to use one of
the "higher" planes; there's no room left in the BMP any more, afaik, if
they are thinking in terms of thousands of code points.
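
One thing I keep reminding myself of: anything outside the BMP stops being
a single 16-bit unit in UTF-16 and becomes a surrogate pair, so "plain
16-bit code points" and "one of the higher planes" pull in opposite
directions. A tiny sketch (the Gothic letter is just a convenient Plane 1
example):

    bmp_char = '\u0915'      # DEVANAGARI LETTER KA, in the BMP
    plane1   = '\U00010330'  # GOTHIC LETTER AHSA, in Plane 1

    for ch in (bmp_char, plane1):
        units = ch.encode('utf-16-be')
        print('U+%04X -> %d 16-bit unit(s)' % (ord(ch), len(units) // 2))
    # Prints 1 unit for the BMP character, 2 for the Plane 1 one.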

16-bit code points are, it seems to me, by no means as worrisome as they
would have been a few years ago. I was seriously considering buying a
low-cost tower computer with a 64-bit processor and a 64-bit internal data
and address path. Most current machines are 32-bit internally; many will
be 64-bit machines in a few years.

Btw, it seems that a good amount of their Web site material is a few years
old. By bound-book standards, of course, that's often of almost no
consequence, but in a field that's developing rapidly, I was wondering,
just a little bit...

I think I read that the newest Uniscribe(s) do shaping and joining of
Indic scripts.

Many thanks for your comments!

Best regards,

--
Nicholas Bodley
Waltham, Mass.