--- In qalam@yahoogroups.com, "Nicholas Bodley" <nbodley@...> wrote:
> On Thu, 24 Mar 2005 05:52:11 -0500, Richard Wordingham
> <richard.wordingham@...> wrote:
>
> > This post is best viewed in UTF-8.
>
> Unfortunately, it looks as though something is mislabeling the
> content.
>
> In your full header, sorry to say, I see:
>
> "Content-Type: text/plain; charset=ISO-8859-1"
>
> Opera's e-mail probably believes that; the Thai and Khmer were
> rendered in Latin-1. As presently set up,
> View-->Encoding--Unicode: UTF-8 apparently doesn't override the
> Content-Type: specification.

I've been doing some experiments (results at
http://groups.yahoo.com/group/JRW_test/messages ). My conclusion is
that, for arbitrary character sets, the only generally workable way
to post from a browser window is for both sender and receiver to
select UTF-7 manually. Unfortunately, UTF-7 is not available in
Internet Explorer 6.0, at least not on Windows XP. (It is available
in Firefox, but not everyone can use the browser they prefer.)

Setting the encoding to UTF-8 in the sender's browser does not work.
When the browser converts the posting to an e-mail (tagged as
described above), the hex byte values 91, 92, 93 and 94 (Private Use
1, Private Use 2, Set Transmit State and Cancel Character) are
corrupted. Those are 4 of the 64 possible UTF-8 trail bytes (80 to
BF), so that wipes out 4 characters in every sequence of 64, and by
my calculations will do a lot of damage to Cyrillic and devastate
Canadian Aboriginal Syllabics.
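
The damage estimate can be checked with a short Python sketch. It
assumes that a character is lost whenever any byte of its UTF-8 form
is 91 to 94; the names CORRUPTED, is_affected and count_affected are
made up for the illustration, and the ranges are the Unicode Cyrillic
block (U+0400 to U+04FF) and the Unified Canadian Aboriginal
Syllabics block (U+1400 to U+167F).

```python
# Byte values reported as corrupted (C1 controls PU1, PU2, STS, CCH).
CORRUPTED = frozenset({0x91, 0x92, 0x93, 0x94})

def is_affected(ch: str) -> bool:
    """True if any byte of the character's UTF-8 form is a corrupted value."""
    return any(b in CORRUPTED for b in ch.encode("utf-8"))

def count_affected(first: int, last: int) -> int:
    """Count code points in [first, last] whose UTF-8 form contains such a byte."""
    return sum(is_affected(chr(cp)) for cp in range(first, last + 1))

cyrillic = count_affected(0x0400, 0x04FF)
syllabics = count_affected(0x1400, 0x167F)
print(f"Cyrillic: {cyrillic} of 256")
print(f"Canadian Aboriginal Syllabics: {syllabics} of 640")
```

Under that assumption Cyrillic loses 16 of 256 code points, among
them U+0411 to U+0414, and the syllabics lose 280 of 640, because
every code point from U+1440 to U+153F has a middle byte of 91 to 94.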

Sending e-mails in UTF-8 does work, but receivers using the browser
must explicitly set the encoding to UTF-8. (It generally stays on
that setting for subsequent Yahoo Groups pages.) If they want to
reply in UTF-8, they will have to send the reply by e-mail.
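
Why the receiver has to switch the encoding by hand can be shown in
a few lines of Python. This is only a sketch of the mislabeling, not
of what Yahoo or any particular browser does internally; the sample
word and the variable names are arbitrary.

```python
text = "ภาษาไทย"  # arbitrary sample; any non-Latin text would do

raw = text.encode("utf-8")  # the bytes the sender actually transmits

as_labelled = raw.decode("iso-8859-1")  # reader trusts charset=ISO-8859-1
as_intended = raw.decode("utf-8")       # reader overrides the label by hand

print(as_intended == text)  # True
print(as_labelled == text)  # False
print(len(text), len(as_labelled))  # one Latin-1 character per UTF-8 byte
```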

Sending e-mails in UTF-7 is a good way of making Internet Explorer
users feel excluded. All they can get is mojibake!
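
For readers unfamiliar with UTF-7: it uses only 7-bit ASCII bytes,
which is presumably why it survives a path that mangles 91 to 94. A
small Python sketch (sample text arbitrary) shows that property and
what a reader without UTF-7 support is left looking at.

```python
text = "ภาษาไทย"  # arbitrary sample text

encoded = text.encode("utf-7")

# Every byte is below 0x80, so none can be 0x91 to 0x94.
print(all(b < 0x80 for b in encoded))

# A client that cannot decode UTF-7 shows this raw ASCII instead of Thai.
print(encoded.decode("ascii"))

# A client that can decode it recovers the original text.
print(encoded.decode("utf-7") == text)
```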

Language-specific encodings generally work, provided one avoids
anything that gets encoded as 91 to 94. This is why I'd never
noticed the problem with Thai - there is no problem with most Thai
encodings if you stick to ASCII and Thai. MacThai may present a few
problems - many of the values in the range 80 to 9F are used to kern
the tone marks and superscript vowels. 92, 93 and 94 code for kerned
mai han-akat, maitaikhu and sara i, I'd guess for use with consonants
with right ascenders (po pla, fo fa and fo fan).
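
The claim about Thai can be checked for one encoding. The sketch
below uses Python's tis_620 codec as a stand-in for "most Thai
encodings" (an assumption; MacThai is not covered here) and the same
91 to 94 byte set as in the experiments.

```python
CORRUPTED = frozenset({0x91, 0x92, 0x93, 0x94})

# Thai letters, vowels, tone marks, digits and punctuation in Unicode.
thai = [chr(cp) for cp in range(0x0E01, 0x0E3B)] + \
       [chr(cp) for cp in range(0x0E3F, 0x0E5C)]
# Printable ASCII.
ascii_chars = [chr(cp) for cp in range(0x20, 0x7F)]

encoded = "".join(thai + ascii_chars).encode("tis_620")

# No Thai or ASCII character lands on a corrupted byte value.
print(any(b in CORRUPTED for b in encoded))  # False

# Lowest non-ASCII byte actually used.
print(hex(min(b for b in encoded if b >= 0x80)))
```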

Richard.