--- In qalam@yahoogroups.com, "suzmccarth" <suzmccarth@...> wrote:
>
> --- In qalam@yahoogroups.com, "Richard Wordingham"
> <richard.wordingham@...> wrote:
> It is quite conceivable that one person will keep data in pointed
> text and another person will keep data in unpointed text for the
> same language. In reality they both might use some points but not
> always the same ones. Sometimes preaspiration will be represented
> but more likely not.
>
> Wouldn't it be easier to have the vowel length overdot represented
> by one non-spacing codepoint and then that codepoint could be
> excluded for the purposes of data sharing and sorting? Maybe tables
> have already been created to match up the pointed and unpointed
> syllabics. If so I would love to hear about it.

I take it you mean 'data searching' rather than 'data sharing'. One
wouldn't want to have to manually reenter length and labialisation
marks.

Doesn't look hopeful. Google distinguishes 'role' and 'rôle' in
English, and Unicode-compliant applications ought to treat composed
characters and equivalent sequences identically.
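
To make the point about equivalent sequences concrete, here is a minimal sketch in Python, using only the standard unicodedata module. The helper name canonically_equal is my own for illustration, not anything from a standard API. It shows that the precomposed and decomposed spellings of 'rôle' differ as raw strings but compare equal once normalised, while plain 'role' stays distinct from both.

```python
import unicodedata

# Two spellings of the same word:
composed = "r\u00f4le"      # 'ô' as one precomposed code point (U+00F4)
decomposed = "ro\u0302le"   # 'o' followed by U+0302 COMBINING CIRCUMFLEX ACCENT


def canonically_equal(a: str, b: str) -> bool:
    """Compare two strings after bringing both to NFC (hypothetical helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    print(composed == decomposed)                  # raw code point comparison
    print(canonically_equal(composed, decomposed))
    print(canonically_equal("role", composed))     # the accent still matters
```

Normalisation only makes equivalent spellings match; it does nothing to make the accented and unaccented words match, which is the part that needs application-specific handling.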

In a bespoke application, of course, the issue is little different
from case-insensitive sorting. For DIY applications, dropping the
pre-aspiration is the most difficult aspect. You might get more of
an issue for the long-vowel versus diphthong alternation.
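
As a rough sketch of what such a DIY approach might look like, the Python below builds a search key in much the same way one would for case-insensitive matching. The names search_key and LONG_TO_SHORT are hypothetical, and the table is a deliberately tiny illustration covering three bare vowels, not a real Cree or Naskapi mapping. As far as I can tell, the long-vowel syllabics are encoded as atomic characters, so stripping combining marks alone would not reach them; hence the explicit table. It makes no attempt at pre-aspiration or the long-vowel versus diphthong alternation.

```python
import unicodedata

# Illustrative and incomplete: dotted (long-vowel) syllabic -> undotted one.
# A real table would have to cover every series the orthography uses.
LONG_TO_SHORT = {
    0x1404: 0x1403,  # CANADIAN SYLLABICS II -> I
    0x1406: 0x1405,  # CANADIAN SYLLABICS OO -> O
    0x140B: 0x140A,  # CANADIAN SYLLABICS AA -> A
}


def search_key(text: str) -> str:
    """Fold text to a key that ignores case, combining marks and vowel length."""
    decomposed = unicodedata.normalize("NFD", text)
    # Drop non-spacing marks (general category Mn), e.g. the circumflex in 'rôle'.
    stripped = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    # Map the long-vowel syllabics covered by the table to their short partners.
    folded = stripped.translate(LONG_TO_SHORT)
    return unicodedata.normalize("NFC", folded).casefold()


if __name__ == "__main__":
    print(search_key("R\u00f4le") == search_key("role"))
    print(search_key("\u1404") == search_key("\u1403"))
```

The stored text keeps its length marks; only the key used for matching drops them, so nothing has to be re-entered by hand.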

You might like to try contacting Bill Jancewicz or Marguerite
MacKenzie if you haven't already. From their article 'Applied
Computer Technology in Cree and Naskapi Language Programs' at
http://llt.msu.edu/vol6num2/pdf/vol6num2.pdf , it sounds as though
there's a lot of activity devoted to converting between encoding
systems.

Richard.