Conjuncts

--- In qalam@yahoogroups.com, "suzmccarth" <suzmccarth@...> wrote:

>
> --- In qalam@yahoogroups.com, "suzmccarth" <suzmccarth@...> wrote:

> > I read it but I missed the part that might explain that the virama
> > has a different nominal glyph (that thing that gets stuck on the
> > keyboard) for Devanagari and Tamil.
>
> Never mind - I went back to the code charts after this. It isn't the
> same codepoint but it has the same name. However, different actions
> in different places.
>
> > It also has two different
> > actions in Devanagari and Tamil, creating conjunct consonants in the
> > one and consonant clusters in the other. This information must be
> > in Uniscribe since it is not in the encoding.

Reading the section 'Left Matras in Malayalam and Tamil' in
http://www.microsoft.com/typography/otfntdev/indicot/shaping.aspx ,
one might think so, but I'm not so sure. Note also that some of the
information resides in the *font*!

Consider the orthographic Devanagari syllable pronounced /nti/. There
seem to be three tolerable ways of writing it in Devanagari,
presenting glyphs from left to right:

A: <i> + <ligature nt>
B: <i> + <half form n> + <ta>
C: <na> + <virama> + <i> + <ta>

Form C can be forced by Unicode sequence 1:

1. <<na>><<virama>><<ZWNJ>><<ta>><<i>>

Form B can be forced by Unicode sequence 2:

2. <<na>><<virama>><<ZWJ>><<ta>><<i>>

Form A is the normal response to the Unicode sequence 3:

3. <<na>><<virama>><<ta>><<i>>
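
For anyone who wants to try these, here is a small sketch that spells the
three sequences out as code points. It assumes the vowel is the ordinary
Devanagari vowel sign I and that sequence 3 is simply na, virama, ta, i with
no joiner; the variable names are only for illustration.

```python
import unicodedata

# Devanagari letters and signs, plus the two joiner controls.
NA, TA = "\u0928", "\u0924"
VIRAMA, VOWEL_I = "\u094D", "\u093F"
ZWNJ, ZWJ = "\u200C", "\u200D"

# The three sequences, keyed by the numbers used in the text.
SEQUENCES = {
    1: NA + VIRAMA + ZWNJ + TA + VOWEL_I,  # asks for the visible virama
    2: NA + VIRAMA + ZWJ + TA + VOWEL_I,   # asks for the half form
    3: NA + VIRAMA + TA + VOWEL_I,         # leaves the choice to the font
}

# Print the character names so the stored order is easy to check.
for number, text in SEQUENCES.items():
    print(number, " + ".join(unicodedata.name(ch) for ch in text))
```

Note that the vowel sign comes last in storage in every case, even though
its glyph is drawn further left.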

However, if I understand the Unicode specification correctly, fonts
are allowed to lack ligatures and half forms. In this case:

Sequence 3, with neither half form nor ligature available, results in Form C.

Sequence 3, with a half form but no ligature available, results in Form B.

Sequence 2, with no half form available, results in Form C.
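
The fallback cases above can be put as a small decision function. This is
only my reading of the rules as stated here, not code from Uniscribe or any
real shaping engine, and the function and parameter names are made up for
the illustration.

```python
def rendered_form(sequence: int, has_ligature: bool, has_half_form: bool) -> str:
    """Return 'A', 'B' or 'C' for sequences 1 to 3, given what the font offers."""
    if sequence == 1:
        # ZWNJ after the virama: always the visible virama.
        return "C"
    if sequence == 2:
        # ZWJ after the virama: half form if the font has one, else virama.
        return "B" if has_half_form else "C"
    if sequence == 3:
        # No joiner: ligature first, then half form, then visible virama.
        if has_ligature:
            return "A"
        if has_half_form:
            return "B"
        return "C"
    raise ValueError("sequence must be 1, 2 or 3")


print(rendered_form(3, has_ligature=False, has_half_form=True))
```

The point is that the outcome depends on two inputs, the sequence and the
font's repertoire, and only the first of those is in the encoded text.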

Now, for modern Tamil there are no half-forms and only one conjunct,
namely <ksha>, so applying the Unicode rules for Devanagari, but with
the Tamil rules for position, a normal Tamil font will force the
adoption of the normal Tamil behaviour, without the rendering agent
having to know anything about the script beyond the decomposition of
the vowels into preposed, following, superscript and subscript parts.
(Note that Tamil <<ka>>+<<pulli>>+<<.sa>>+<<e>> yields <e>+<ksha>, not
<ka>+<pulli>+<e>+<.sa>.)
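
To make the Tamil case concrete, here is a toy shaper, a sketch under heavy
assumptions: it knows only consonants, the pulli and the preposed vowel sign
E, and it takes the font's conjuncts as a plain dictionary. The names
shape, GLYPH and conjuncts are hypothetical, and no real rendering engine is
claimed to work this way.

```python
KA, PULLI, SSA, E = "\u0B95", "\u0BCD", "\u0BB7", "\u0BC6"
GLYPH = {KA: "ka", SSA: ".sa"}


def shape(text, conjuncts):
    """Return glyph names left to right; conjuncts maps (c1, c2) to a glyph."""
    glyphs = []
    cluster_start = 0  # position in glyphs where the current cluster begins
    i = 0
    while i < len(text):
        ch = text[i]
        triple = text[i:i + 3]
        if len(triple) == 3 and triple[1] == PULLI and (triple[0], triple[2]) in conjuncts:
            # The font has a conjunct: consonant + pulli + consonant is one cluster.
            cluster_start = len(glyphs)
            glyphs.append(conjuncts[(triple[0], triple[2])])
            i += 3
        elif ch == E:
            # Preposed vowel: goes to the left of the current cluster.
            glyphs.insert(cluster_start, "e")
            i += 1
        elif ch == PULLI:
            # No conjunct available: the pulli stays visible.
            glyphs.append("pulli")
            i += 1
        else:
            # A plain consonant starts a new cluster.
            cluster_start = len(glyphs)
            glyphs.append(GLYPH[ch])
            i += 1
    return glyphs


text = KA + PULLI + SSA + E
print(shape(text, {(KA, SSA): "ksha"}))  # font with the <ksha> conjunct
print(shape(text, {}))                   # font with no conjuncts at all
```

The same routine gives <e>+<ksha> with the first font and
<ka>+<pulli>+<e>+<.sa> with the second, so the difference comes entirely
from the font's conjunct table.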

(There are undoubtedly subtleties I have overlooked, but a generic
description seems perfectly possible.) Of course, Uniscribe may well
not bother looking for Tamil half-forms (which would be a mistake if
they once existed) or conjuncts other than <ksha> (which optimisation
will then one day draw howls of dismay from someone producing an
archaic font).

Richard.