Unicode for Brahmi-derived Scripts (was: Codes for internationaliza

--- In qalam@yahoogroups.com, Andrew Dunbar <hippietrail@...> wrote:

> --- Doug Ewell <dewell@...> wrote:

> Here's an example on-topic computer-related question:
>
> What on earth is the correct way to encode Aung San Suu
> Kyi's name in Unicode Burmese as written on
> http://www.dassk.com/

(da + e + aa + virama + ZWNJ 'Daw')
a + e + aa + nga + virama + ZWNJ 'Aung'
cha + na + virama + ZWNJ + visarga 'San'
ca + u 'Suu'
ka + virama + ra + nnya + virama + ZWNJ 'Kyi'
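
For anyone who wants the actual code points, here is a rough Python
sketch that transcribes the sequence exactly as written above. The
CP table, the SYLLABLES table and the encode() helper are names made
up for this illustration; the code point values are the ones from the
Myanmar block chart plus U+200C for ZWNJ. It only mirrors the
virama + ZWNJ convention used in this message and does not check
whether any particular font renders it.

```python
# Code points for the character names used in the message
# (Myanmar block, plus ZERO WIDTH NON-JOINER).
CP = {
    "ka": 0x1000, "nga": 0x1004, "ca": 0x1005, "cha": 0x1006,
    "nnya": 0x100A, "da": 0x1012, "na": 0x1014, "ra": 0x101B,
    "a": 0x1021, "aa": 0x102C, "u": 0x102F, "e": 0x1031,
    "visarga": 0x1038, "virama": 0x1039, "ZWNJ": 0x200C,
}

# Syllable-by-syllable spelling, copied from the message.
SYLLABLES = {
    "Daw": "da + e + aa + virama + ZWNJ",
    "Aung": "a + e + aa + nga + virama + ZWNJ",
    "San": "cha + na + virama + ZWNJ + visarga",
    "Suu": "ca + u",
    "Kyi": "ka + virama + ra + nnya + virama + ZWNJ",
}


def encode(spec):
    """Turn 'name + name + ...' into a string of the named characters."""
    return "".join(chr(CP[name.strip()]) for name in spec.split("+"))


if __name__ == "__main__":
    for label, spec in SYLLABLES.items():
        text = encode(spec)
        # Show each syllable as a list of U+XXXX values.
        print(label, " ".join(f"U+{ord(c):04X}" for c in text))
```

Joining the encoded syllables in order (with or without 'Daw') gives
the full name as one string.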

Not that much more obscure than 'edF eAac\ Sn\; su @kv\' in SEAsite's
Myanmar1 font! (More legible in Internet Explorer than in Notepad, for
a change.) What's the problem? The inconsistent way Unicode handles
the bracketing two-part dependent vowel <au>? That the Burmese
consonants are named after their Indian rather than their Burmese
pronunciations?

I mention the phonetic, non-Unicode encoding because I don't have any
Windows-compatible Unicode-encoded fonts for Burmese.

Incidentally, is the Burmese 'visarga' really related to the
Devanagari visarga?

> And where can I read the *correct* ways to encode
> exotic scripts in Unicode? Khmer, Burmese, Tibetan,
> Sinhala have always eluded me.

http://www.unicode.org/versions/Unicode4.0.0/ch10.pdf plus the charts,
indexed by script at http://www.unicode.org/charts/ . You might have
to do some digging for the ordering of the various subscripts and
superscripts, but for Burmese there's a clear table.

Richard.