--- In qalam@yahoogroups.com, "suzmccarth" <suzmccarth@...> wrote:
>
> --- In qalam@yahoogroups.com, "Richard Wordingham"
> <richard.wordingham@...> wrote:
>
> > Clearly you'd need to google in Cree - but it's not an option and
> > there may not even be a language code for Cree!
>
> Tamil is not an option either in google language tools but I just
> paste Tamil into the basic google search and it works. I don't
> usually bother with the language tools unless I want a word that is
> spelled the same in French and English and I wish to restrict the
> search.
Tamil is a simple language as far as character comparison is
concerned - there are no diacritics that you may wish to suppress
when searching.
The point, which Marco touches on, is that what we may find more
convenient is a user-selectable diacritic suppression facility. Of
course, there may be disagreement as to what a diacritic is. Vowel
marks in Brahmi-derived scripts are probably not to be regarded as
diacritics, but Thai tone and shortness marks should be, I think.
The reasons for wanting to ignore Thai tone marks are that:
a) In loan words from English, they are generally optional and often
inconsistent - the word from English 'cake' occurs both with mai tho
and with mai tri; the word from English 'visa' appears with and
without a tone mark.
b) Trying to work out a native Thai place name from its official
romanisation is a nightmare. Every little helps. If you can ignore
tone marks, you may be lucky enough only to have to consider four
(sometimes only two!) likely possibilities for each syllable.
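
To make the idea concrete, here is a minimal sketch in Python of what a user-selectable suppression step might look like. It assumes that only the four tone marks (U+0E48 to U+0E4B) and maitaikhu (U+0E47) count as suppressible; the names THAI_SUPPRESSIBLE, suppress_marks and loose_match are made up for illustration and are not from any search engine.

```python
# Assumption: the marks a user might choose to ignore when searching Thai.
# U+0E47 is maitaikhu (the shortness mark); U+0E48..U+0E4B are the tone
# marks mai ek, mai tho, mai tri and mai chattawa.
THAI_SUPPRESSIBLE = frozenset("\u0e47\u0e48\u0e49\u0e4a\u0e4b")


def suppress_marks(text, marks=THAI_SUPPRESSIBLE):
    """Return text with every character in `marks` removed.

    Passing a different `marks` set is the "user-selectable" part:
    the caller decides what counts as a diacritic.
    """
    return "".join(ch for ch in text if ch not in marks)


def loose_match(a, b, marks=THAI_SUPPRESSIBLE):
    """Compare two strings while ignoring the selected marks."""
    return suppress_marks(a, marks) == suppress_marks(b, marks)


# 'cake' spelled with mai tho and with mai tri
cake_mai_tho = "\u0e40\u0e04\u0e49\u0e01"
cake_mai_tri = "\u0e40\u0e04\u0e4a\u0e01"

# 'visa' spelled with and without mai ek
visa_marked = "\u0e27\u0e35\u0e0b\u0e48\u0e32"
visa_plain = "\u0e27\u0e35\u0e0b\u0e32"

print(loose_match(cake_mai_tho, cake_mai_tri))  # both spellings compare equal
print(loose_match(visa_marked, visa_plain))     # marked and unmarked compare equal
```

Vowel signs are left alone here, in line with the view that they are not diacritics; a user who disagrees could pass a larger set as `marks`.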
> However Cree will not display in google yet.
Where's the problem? Isn't Cree encoded as UTF-8 or UTF-16?
Richard.