--- In qalam@yahoogroups.com, "Nicholas Bodley" <nbodley@...> wrote:
>
> Here's a concise and recent document about the problem; it's relatively
> well written:
> <http://www.icann.org/announcements/announcement-23feb05.htm>
> Excerpt:
> "Homograph domain name spoofing works by exploiting the visual
> resemblance, or near resemblance of certain characters and symbols."
>
> The thought crossed my mind that font designers might want to consider, in
> the long run, designs that would not be easily prone to spoofing. I think
> it's not hard to say why that's a bad idea, in some respects.

Indeed! For _languages_ [sic] that don't have undotted 'i' in their
normal repertoire, aren't, for example, i (U+0069) circumflex and
undotted i (U+0131) circumflex identical?
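
However alike they look, the two sequences stay distinct at the code point level. A minimal Python sketch, assuming the standard unicodedata module and taking "circumflex" to mean the combining circumflex U+0302 (that reading and the variable names are assumptions for illustration):

```python
import unicodedata

# Assumption: "circumflex" here means COMBINING CIRCUMFLEX ACCENT (U+0302).
dotted = "i\u0302"         # U+0069 + U+0302
dotless = "\u0131\u0302"   # U+0131 + U+0302

# NFC composes the dotted sequence to U+00EE; there is no precomposed
# dotless-i-with-circumflex, so the other sequence is left as it is.
dotted_nfc = unicodedata.normalize("NFC", dotted)
dotless_nfc = unicodedata.normalize("NFC", dotless)

# Even compatibility normalization (NFKC) does not merge the two.
same_after_nfkc = (unicodedata.normalize("NFKC", dotted)
                   == unicodedata.normalize("NFKC", dotless))

print([hex(ord(c)) for c in dotted_nfc])
print([hex(ord(c)) for c in dotless_nfc])
print(same_after_nfkc)
```

So anything that compares strings by code point will treat the two as different, whatever the font shows.
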
Some Roman, Greek and Cyrillic capitals are identical, and it makes no
sense to distinguish them. I considered it most improper for the head
of the pure maths department to produce a diagram containing both
capital 'M' and capital mu.
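
To make the point concrete, here is a small Python sketch (standard unicodedata module; the list name is only illustrative) showing that the look-alike capitals are separate characters as far as Unicode is concerned:

```python
import unicodedata

# Latin M, Greek capital mu, Cyrillic capital em: three code points, one shape.
lookalikes = ["\u004d", "\u039c", "\u041c"]

names = [unicodedata.name(c) for c in lookalikes]
# NFKC leaves each one alone, so normalization does not unify them.
unchanged = [unicodedata.normalize("NFKC", c) == c for c in lookalikes]

for c, n in zip(lookalikes, names):
    print(f"U+{ord(c):04X} {n}")
```
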
The classic disaster in this field is 'Khmer letter qa' (U+17A2) and
'Khmer independent vowel qaq' (U+17A3), which are identical! If there is
one letter that should be deleted from Unicode, it's probably the
latter. Mind you, given the abuse they heaped on Michael Everson over
the Khmer encoding, I think the Cambodians deserve all the problems
they get. I gather they're having to lump it now - there's
nothing better on offer! It shouldn't be an Internet problem so long
as registries refuse to accept deprecated letters.
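
A registry-side check of that kind might look roughly like the following Python sketch. The deny-list and the function name are hypothetical and deliberately incomplete; this is not how any actual registry is known to do it:

```python
# Hypothetical, deliberately incomplete deny-list: only the two deprecated
# Khmer independent vowels. A real registry would derive this from the
# Unicode Character Database rather than hard-code it.
DEPRECATED = {0x17A3, 0x17A4}

def has_deprecated(label: str) -> bool:
    """Return True if any code point in the label is on the deny-list."""
    return any(ord(ch) in DEPRECATED for ch in label)

print(has_deprecated("\u17a2"))  # KHMER LETTER QA
print(has_deprecated("\u17a3"))  # KHMER INDEPENDENT VOWEL QAQ
```
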
Khmer has some nasty features. The diacritics muusekatoan and treisap
both have an alloglyph that is the same as the dependent vowel sign u,
though the context disambiguates them. Moreover, da and ta have the
same subscript form, which is really that of ta. Medially, the
subscript pronounced /d/ (implosive?) often derives from Pali or
Sanskrit tta, the script cognate of Khmer da, so it's not unreasonable
to require them to be distinguished. (I think confusing speech and
writing made things clearer here!) Unicode says they should be
distinguished according to pronunciation - I've no idea what
Cambodians are actually doing.
I wonder if there were any Coptic domain names last year. Any such
names may have become a dodgy-seeming mixture of Greek and Coptic with
the disunification of Coptic and Greek! Unless I'm missing something,
they're not even 'compatibility equivalent'!
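
One way to check the 'compatibility equivalent' question, sketched in Python with the standard unicodedata module (the choice of alfa/alpha as the test pair is just an example):

```python
import unicodedata

coptic_alfa = "\u2c81"   # COPTIC SMALL LETTER ALFA
greek_alpha = "\u03b1"   # GREEK SMALL LETTER ALPHA

# An empty decomposition field means no canonical or compatibility mapping.
decomp = unicodedata.decomposition(coptic_alfa)
folded = unicodedata.normalize("NFKC", coptic_alfa)

print(repr(decomp))
print(folded == greek_alpha)
```
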
Richard.