Re: African languages in Arabic script?

Bob Hallissy wrote:

> So what we would really like is a mim with a dot underneath for mb and
> something like possibly a nun with a little v above for ny.

Note that Unicode has combining diacritics, so modified letters that are
not on the charts can also be produced.

Most of these diacritics are allocated in the block 0300..036F. See the
Unicode chart for them:

http://charts.unicode.org/Web/U0300.html

As you see, both the little "v" above and the dot underneath are included:

030C (COMBINING CARON)
0323 (COMBINING DOT BELOW)

These diacritics are shared by all scripts, so there is no theoretical
reason not to use them with Arabic letters too.

So your two special letters could be encoded with a *pair* of codes each:

0645 0323 (ARABIC LETTER MEEM, COMBINING DOT BELOW)
0646 030C (ARABIC LETTER NOON, COMBINING CARON)
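
As a small illustration of what "a pair of codes" means in practice, here is
a minimal Python sketch that builds the two sequences and prints the code
points with their Unicode names. The names MB, NY and describe() are just my
own labels for this example.

```python
import unicodedata

# The two proposed letters, each as a base letter plus one combining mark.
MB = "\u0645\u0323"  # ARABIC LETTER MEEM + COMBINING DOT BELOW
NY = "\u0646\u030C"  # ARABIC LETTER NOON + COMBINING CARON

def describe(sequence):
    """Return the Unicode character name of each code point in a sequence."""
    return [unicodedata.name(ch) for ch in sequence]

for letter in (MB, NY):
    codes = " ".join(f"{ord(ch):04X}" for ch in letter)
    print(codes, describe(letter))
```

Each "letter" is a string of length 2, which is exactly where the practical
problems come from.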

In practice, however, there are a number of problems that must be addressed:

1) Display: commercially available fonts may be optimized so that combining
characters work well with left-to-right scripts but not with right-to-left
scripts. So the font must be fine-tuned for this language.

2) Keyboard: if each of these letters has to be assigned to a single
key, the keyboard driver must be able to fit a sequence of characters in
each slot.

3) Cursor movement: if these have to be handled as single letters, the
cursor must not stop on these two diacritic marks. However, if Arabic vowel
marks are used, it must stop on them (one may wish to delete a certain
vowel).

4) Sorting: the programs that put text in alphabetical order must know that
these sequences of codes count as a single letter, and where these letters
sort in the alphabet.
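
To make point 4 concrete, here is a rough Python sketch of a sort key that
treats each sequence as one letter. The alphabet order in ALPHABET is purely
my assumption (each new letter placed right after its base letter), and
split_letters() and sort_key() are hypothetical helper names; a real
implementation would use whatever order the orthography defines.

```python
# Hypothetical alphabet fragment (an assumption for illustration only):
# each new letter is placed right after its base letter.
MB = "\u0645\u0323"  # meem + combining dot below
NY = "\u0646\u030C"  # noon + combining caron
ALPHABET = ["\u0644", "\u0645", MB, "\u0646", NY, "\u0647"]  # lam .. heh

def split_letters(word):
    """Split a word into letters, preferring the longest match in ALPHABET."""
    units = sorted(ALPHABET, key=len, reverse=True)
    letters, i = [], 0
    while i < len(word):
        for unit in units:
            if word.startswith(unit, i):
                letters.append(unit)
                i += len(unit)
                break
        else:
            letters.append(word[i])  # unknown character: keep it as-is
            i += 1
    return letters

def sort_key(word):
    """Map each letter to its rank in ALPHABET; unknown characters sort last."""
    rank = {letter: n for n, letter in enumerate(ALPHABET)}
    return [rank.get(letter, len(ALPHABET)) for letter in split_letters(word)]

words = [NY + "\u0644", "\u0646\u0647", MB, "\u0645\u0644"]
print(sorted(words, key=sort_key))
```

A plain code-point sort would put the "mb" letter before meem + lam, because
the combining marks have lower code points than the Arabic letters; the key
above puts it after, as the assumed alphabet requires.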

However, these are the same problems that Unicode poses for more or less all
languages and all alphabets. So, software developers are coming up with
generalized solutions for these issues.
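
For instance, the cursor rule in point 3 could be sketched like this in
Python. This is only a toy under my own assumptions: LETTER_MARKS and
cursor_stops() are hypothetical names, and a real editor would rely on a
proper text-segmentation algorithm rather than a hard-coded set.

```python
# Marks that are part of a letter in this (hypothetical) orthography;
# the cursor should never stop in front of them.
LETTER_MARKS = {"\u0323", "\u030C"}  # COMBINING DOT BELOW, COMBINING CARON

def cursor_stops(text):
    """Return the indices in `text` where the cursor may rest."""
    stops = [i for i, ch in enumerate(text) if ch not in LETTER_MARKS]
    stops.append(len(text))  # position after the last character
    return stops

# meem + dot below + fatha (a vowel mark), then noon + caron
sample = "\u0645\u0323\u064E\u0646\u030C"
print(cursor_stops(sample))  # [0, 2, 3, 5]
```

The cursor skips the dot below and the caron but still stops before the
fatha, so a vowel mark can be deleted on its own.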

In particular, point 1 should be addressable using "smart font" technologies
such as SIL's Graphite.

BTW, a possible alternative for the "ñ" sound is 0683 (ARABIC LETTER NYEH):
it is a Sindhi letter that should have that sound, AFAIK.

I am curious about some aspects of this orthographic project:

- Which language is it?

- Do you have a list of phonemes, especially vowels?

- Why are digraphs ruled out?

BTW, isn't it surprising that computers are so important nowadays that a new
orthography is being developed by looking at Unicode charts?

_ Marco