Peter Constable wrote, on Sunday, March 27, 2005 4:16 AM:
> From: Richard Wordingham [mailto:richard.wordingham@...]

>> Thai needn't be a 'complex script'; you can get typewriter quality
>> (still considered good enough for subtitling on UBC, the Thai
>> cable/satellite company - see
>> http://www.thaivisa.com/forum/index.php?showtopic=29663&st=14 )
>> without any complex processing.
>
> But typewriter quality is not good enough for very many applications, and
> getting more than typewriter quality takes a modest bit of processing.
> Also, even with typewriter-quality Thai, sorting takes additional
> processing.

The latter doesn't necessarily make it any more 'complex' than, say, Welsh! Just as Welsh 'ng' comes between 'g' and 'h', so each of the combinations of one of the 5 preposed vowels and a consonant could be treated as an extra consonant coming after the plain consonant.
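To make the 'extra consonant' idea concrete, here is a minimal Python sketch. The names (PREPOSED, is_consonant, sort_key) are mine, purely for illustration, and it assumes plain code point order is good enough for everything else. It ignores tone-mark weighting and the other refinements a real Thai collation needs, so treat it as a demonstration of the principle, not as a dictionary-order implementation.

```python
# Illustrative sketch only: treats "preposed vowel + consonant" as one
# collation unit that sorts after the plain consonant, much as Welsh 'ng'
# is a single unit that sorts between 'g' and 'h'.

# The five preposed vowels: sara e, sara ae, sara o, sara ai maimuan,
# sara ai maimalai.
PREPOSED = "\u0E40\u0E41\u0E42\u0E43\u0E44"


def is_consonant(ch):
    # Assumption: everything from KO KAI (U+0E01) to HO NOKHUK (U+0E2E)
    # counts as a consonant here, including RU and LU.
    return "\u0E01" <= ch <= "\u0E2E"


def sort_key(word):
    key = []
    i = 0
    while i < len(word):
        ch = word[i]
        if ch in PREPOSED and i + 1 < len(word) and is_consonant(word[i + 1]):
            # One unit: primary weight from the consonant, secondary weight
            # from the vowel, so it follows the plain consonant.
            key.append((ord(word[i + 1]), 1 + PREPOSED.index(ch)))
            i += 2
        else:
            key.append((ord(ch), 0))
            i += 1
    return key


words = ["\u0E40\u0E01", "\u0E02\u0E32", "\u0E01\u0E32"]  # เก, ขา, กา
print(sorted(words))                # code point order: กา, ขา, เก
print(sorted(words, key=sort_key))  # with the sketch:  กา, เก, ขา
```

No reordering pass over the text is needed; the only extra work is recognising two-character units, which is exactly what a Welsh sort has to do for 'ng'.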

(If viewing this in the Qalam archives, switch the encoding to UTF-8!)

I had a look at the Microsoft Thai Typography page to see what was gained by treating Thai as a complex font. The gains seem to be:
a) Better graphic warning of invalid combinations of marks, IF you are using an OpenType font.
b) Kerning is done between base glyphs. However, base glyphs are defined as:

'any glyph that can have a diacritic mark above or below it. Layout operations are defined in terms of a base glyph, not a base character, as a ligature may act as the base.'

As this definition excludes the preceding and following vowel marks, I'm actually having difficulty thinking of any pairs one might want to apply kerning to.

c) Sara am is decomposed, but I'm not sure you couldn't do that in a 'standard' (i.e. non-complex) font. However, it saves on the substitution definitions that the font would otherwise need in order to render ป้ำ 'strong' with the circle below the tone mark and to the left of the ascender. The effect is as though you had typed po pla, nikkahit, mai tho, sara aa, which the standard Windows XP input method does not allow, although I was able to generate it: ปํ้า
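The decomposition in (c) can be sketched at the character level as follows. This is only my illustration of the effect described above, not what Uniscribe or any font actually does internally (that happens on glyphs), and the function name decompose_sara_am is hypothetical. It assumes the only mark that needs to swap places with the nikkahit (NIKHAHIT in the Unicode character names) is one of the four tone marks.

```python
import re

SARA_AM = "\u0E33"
NIKHAHIT = "\u0E4D"   # the circle part of sara am
SARA_AA = "\u0E32"    # the spacing part of sara am
TONE_MARKS = "\u0E48\u0E49\u0E4A\u0E4B"  # mai ek, mai tho, mai tri, mai chattawa

# An optional tone mark immediately followed by sara am.
_PATTERN = re.compile("([" + TONE_MARKS + "]?)" + SARA_AM)


def decompose_sara_am(text):
    # Split sara am into nikhahit + sara aa and move the nikhahit in front
    # of the tone mark, so that the circle ends up below the tone mark.
    return _PATTERN.sub(lambda m: NIKHAHIT + m.group(1) + SARA_AA, text)


# po pla, mai tho, sara am  ->  po pla, nikhahit, mai tho, sara aa
result = decompose_sara_am("\u0E1B\u0E49\u0E33")
print([f"U+{ord(ch):04X}" for ch in result])
```

The output sequence is the same one that I typed in by hand above.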

Feature (c) seems to be the biggest gain from treating Thai as a complex font.

Some of the allowed combinations seem bizarre - perhaps it's because they're only needed for minority languages. Does anyone know? Are there instances of thanthakhat appearing above a tone mark? I can't see why anything should be allowed between yamakkan and the consonant below. Is phinthu really allowed with a vowel symbol? The input editor doesn't allow it. Of course, the published description may be wrong - sara i plus nikkahit, as in Sanskrit สึห siṁha 'lion', has to be typed in as the traditional equivalent, sara ue.

The Thai reaction to [เล็่น], combining maitaikhu with mai ek, as a purely phonetic way of showing the actual pronunciation of <เล่น>, was effectively, 'You what?'. It's apparently allowed by the Uniscribe shaping engine, but again is forbidden by the input method.
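For anyone whose display mangles the bracketed example, a small Python snippet like the following lists what is in it. The helper name describe is just for illustration; unicodedata is in the Python standard library.

```python
import unicodedata


def describe(text):
    # Return (code point, Unicode character name) for each character.
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


# sara e, lo ling, maitaikhu, mai ek, no nu
for code_point, name in describe("\u0E40\u0E25\u0E47\u0E48\u0E19"):
    print(code_point, name)
```

The third and fourth characters are the maitaikhu and mai ek stacked on the same consonant.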

Richard.
