Re: Cree collation sequence

--- In qalam@yahoogroups.com, Marco Cimarosti <marco.cimarosti@...>
wrote:

> Richard Wordingham wrote:
> > According to the Ethnologue entry for Naskapi at
> > http://www.ethnologue.com/show_language.asp?code=NSK , there are
> > more than 70 speakers literate in the dialect (language by
> > Ethnologue's reckoning). When do you think Google will get round
> > to implementing searches that aren't thrown by these variations?
>
> They'll never do that, of course! And I think that the world will survive
> that: I doubt that 70 people will ever generate enough web content to make
> it necessary to have Naskapi-language searches on Google...
>
> BTW, I don't think Google has language-specific searches even for a language
> such as Italian, spoken by 60 million people. If you want to find all the
> occurrences of "ob(b)iettivo", you must type "obiettivo OR obbiettivo".
>
> But Google does have *script*-specific features for, e.g., the Latin script:
> if you search for "cafe" you'll also find "café", "Cafe", "CAFÉ", etc.

The accents aren't ignored when you specify pages in the English
language, at least not from www.google.co.uk. They are if you specify
pages in the French language. Of course, 'English doesn't use
accents', so they don't need to be handled. Google might have only a
few selections of what are really script-specific features, but it
may make sense to enable or disable them on the basis of language.

One problem with the language-based selection is that a good many
pages don't indicate their language.
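The language-dependent behaviour described above can be sketched roughly as follows. This is only an illustration, not a description of how Google works: the function name `fold` and the set `ACCENT_INSENSITIVE` are assumptions made up for the example, and the set simply mirrors the observation that accents were ignored for French pages but not for English ones.

```python
import unicodedata

# Hypothetical setting: languages for which accents are ignored when matching.
# It mirrors the behaviour observed above; it is not a documented Google list.
ACCENT_INSENSITIVE = {"fr"}

def fold(text, lang):
    """Build a search key: case is always folded, accents only for some languages."""
    # Decompose so that accents become separate combining marks.
    decomposed = unicodedata.normalize("NFD", text.casefold())
    if lang in ACCENT_INSENSITIVE:
        # Drop the combining marks, keeping only the base letters.
        decomposed = "".join(
            ch for ch in decomposed if not unicodedata.combining(ch)
        )
    # Recompose whatever is left so keys compare consistently.
    return unicodedata.normalize("NFC", decomposed)

print(fold("CAF\u00c9", "fr"))  # accent and case folded
print(fold("CAF\u00c9", "en"))  # only case folded
```

A page whose language is unknown would need some default here, which is exactly the problem with pages that do not declare their language.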

> And I think that it *could* be possible that something like that can be put
> in place for Canadian Syllabics too. But someone (Unicode? ICU?) should
> first publish a language-independent collation of the syllables where, e.g.,
> the difference between pointed and unpointed syllables is ignored.

What's in the Canadian standard CAN/CSA Z243.4.1? The Unicode
Collation Algorithm (http://www.unicode.org/unicode/reports/tr10)
Version 4.0 does mention it.

> Thinking of that, considering that Qalam seems to have a couple of members
> with a good working knowledge of Canadian Syllabics and languages which use
> it, such a collation specification could be a nice contribution from the
> Qalamites to the world...

And the current collation codes can be found at
http://www.unicode.org/Public/UCA/latest/allkeys.txt . There's
nothing sophisticated there for Canadian Aboriginal Syllabics, unless
you count putting QAI and NGAI in the right place. However, do the
majority of their user communities regard length and labialisation as
secondary features? There may be even more issues with the vowelless
consonants (typically syllable final). Another issue is that there
appear to be fricativisation and affricatisation diacritics. They
may, however, be no more significant than the stretch and
dent 'diacritics' used to make extra fricative letters in Thai.
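If length were treated as a secondary feature, the ordering could be sketched as a two-level key: a length-folded form first, with the exact spelling only as a tie-breaker. The table `LENGTH_FOLD` and the functions `primary` and `sort_key` below are hypothetical names invented for this illustration. The table lists only three long/short pairs that occur in the examples in this thread, and nothing here is taken from allkeys.txt or from any published tailoring.

```python
# Hypothetical folding table: long-vowel (pointed) syllable -> short counterpart.
# Only three pairs from the examples in this thread are listed.
LENGTH_FOLD = {
    "\u140B": "\u140A",  # AA  -> A
    "\u1456": "\u1455",  # TAA -> TA
    "\u14F2": "\u14F1",  # SUU -> SU
}

def primary(word):
    """Key that ignores vowel length for the syllables in LENGTH_FOLD."""
    return "".join(LENGTH_FOLD.get(ch, ch) for ch in word)

def sort_key(word):
    """Two-level key: length-folded form first, exact spelling as tie-breaker."""
    return (primary(word), word)

words = ["\u140A\u14F1", "\u140B\u146F", "\u140A\u1455"]  # A-SU, AA-KU, A-TA

print(sorted(words))                # plain code point order
print(sorted(words, key=sort_key))  # length ignored at the first level
```

With plain code point order AA-KU sorts after everything beginning with short A; with the two-level key it falls between A-TA and A-SU, as if its first syllable were short. Labialisation, finals and the other diacritics mentioned above are not handled by this sketch.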

> > How did you eliminate YU-W as an alternative to YUU? Word finally,
> > the examples showed Naskapi -YU-W corresponding to Eastern Cree -YUU.
>
> I don't know a single word in Naskapi: I just spelled out the permutations
> that Suzanne mentioned.

Suzanne gave you the following pairs:
E: U+140B U+1426 U+1456 U+1505 U+1431 U+14F2 AA-H-TAA-S-PI-SUU
   change clothes
N: U+140A U+1455 U+1505 U+1431 U+14F1 U+1424 A-TA-S-PI-SU-W
   change clothes

E: U+140B U+1426 U+146F U+14F2 AA-H-KU-SUU sick
N: U+140A U+146F U+14F1 U+1424 A-KU-SU-W sick
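
For anyone who wants to check the code points against their character names, a small sketch follows. The helper `parse` is a made-up name for this example; `unicodedata.name` is the standard Python lookup. The names are formal identifiers and need not match the transliteration used above letter for letter.

```python
import unicodedata

def parse(codepoints):
    """Turn a string like 'U+140A U+146F' into the characters it denotes."""
    return "".join(chr(int(cp[2:], 16)) for cp in codepoints.split())

eastern = parse("U+140B U+1426 U+146F U+14F2")  # AA-H-KU-SUU
naskapi = parse("U+140A U+146F U+14F1 U+1424")  # A-KU-SU-W

# List each character with its code point and Unicode name.
for label, word in (("E", eastern), ("N", naskapi)):
    print(label, word)
    for ch in word:
        print(f"  U+{ord(ch):04X} {unicodedata.name(ch)}")
```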

My knowledge of Naskapi is as limited as yours.

Richard.