Re: Replying to UTF-8

From: Richard Wordingham
Message: 67005
Date: 2010-12-30

--- In cybalist@yahoogroups.com, "Torsten" <tgpedersen@...> wrote:
> --- In cybalist@yahoogroups.com, "Richard Wordingham" <richard.wordingham@> wrote:
> >
> > --- In cybalist@yahoogroups.com, "Torsten" <tgpedersen@> wrote:
> > > --- In cybalist@yahoogroups.com, "stlatos" <stlatos@> wrote:
> >
> > > And by the way, would you mind answering UTF-8 postings in UTF-8
> > > so I don't have to clear up the mess you leave?
> >
> > Please provide a tutorial on how to reply in UTF-8 and preserve the
> > thread when one has not received the post via e-mail. Recall that
> > arbitrary UTF-8 cannot be sent via the Yahoo groups web interface.

> 3) Replace unrepresentable characters in your text
> long e (e with macron) -> e:
> long hyphen -> short hyphen (minus)
> capital Greek letters -> small Greek letters

The unrepresentable characters for a web-interface post are those whose UTF-8 representation contains any of the bytes 0x91, 0x92, 0x93 or 0x94. These include the following BMP characters:

a) Those whose code point modulo 0x100 is 0x11, 0x12, 0x13, 0x14, 0x51, 0x52, 0x53, 0x54, 0x91, 0x92, 0x93, 0x94, 0xd1, 0xd2, 0xd3 or 0xd4 (the final UTF-8 byte is then 0x91 to 0x94).
Thus you lose long e, U+2011 NON-BREAKING HYPHEN and capital alpha, beta, gamma and delta. (Other plain Greek capitals get through.)

b) Ranges U+x440 to U+x53F, for x = 1, 2, 3, ..., F (the middle byte of the three-byte sequence is then 0x91 to 0x94).
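
For anyone who would rather check a draft mechanically than by eye, here is a minimal Python sketch of the rule above. It assumes only what I described, namely that a character is lost exactly when its UTF-8 encoding contains one of the bytes 0x91 to 0x94. The function names are my own invention, and I have not checked it against every quirk of the Yahoo interface.

```python
# Bytes that the web interface is assumed to trash (per the rule above).
BAD_BYTES = {0x91, 0x92, 0x93, 0x94}


def is_trashed(ch):
    """Return True if the UTF-8 encoding of ch contains a byte 0x91-0x94."""
    return any(b in BAD_BYTES for b in ch.encode("utf-8"))


def trashed_chars(text):
    """Return the distinct affected characters in text, in order of first appearance."""
    seen = []
    for ch in text:
        if is_trashed(ch) and ch not in seen:
            seen.append(ch)
    return seen


if __name__ == "__main__":
    draft = "long e: \u0113, hyphen: \u2011, Greek: \u0391 \u0395"
    for ch in trashed_chars(draft):
        # Report each character to replace, with its code point.
        print("U+%04X %s" % (ord(ch), ch))
```

Running it over a draft before pasting into the web form lists the characters that need replacing; capital epsilon (U+0395) in the sample is not reported, in line with the remark that the other plain Greek capitals get through.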

I'd hoped you'd found a way of getting Yahoo to belatedly send one a post.

I've experimented with extracting the original post using the view source option on the web page, dumping it in a text file, and dragging the file to one's e-mail client. It has worked on Ubuntu using Firefox, emacs, the File Browser and Claws (e.g. http://tech.groups.yahoo.com/group/JRW_test/message/24 ), preserving the thread structure, but it's fiddly even when it works. Dragging to one's e-mail client is sometimes blocked as an anti-virus measure.

> 3) In general, click 'Preview' before clicking 'Send' to check if all characters in the posting can be represented; be aware, however, that the characters mentioned in 2) will show up correctly in preview, but *not* in the posting on the site, so be sure to replace them first in the text (one at a time, use 'Find', ctrl-f)

For anyone who's confused, this is obviously Step 4, and '2)' refers to Step 3, the conversion of 'unrepresentable' characters (i.e. those trashed by the Yahoo web interface).

Of course, all these problems would be avoided by sticking to ASCII, as John Vertical reminded us.

Richard.