Peter T. Daniels wrote:
> Michael Everson wrote:
> >
> > At 04:43 -0400 2003-09-15, Tex Texin wrote:
> >
> > >Now that is an interesting comment, and I probably should know this, and a
> > >quick look didn't turn up the answer: What is the criteria by which Unicode
> > >determines what is in or out of a script?
> >
> > Common sense?
>
> Doesn't look like common sense to me to say that Arabic is a subset of
> Urdu.

I saw no one saying this. Rather, in Unicodish, one would say that the "Urdu
alphabet" is a subset of the "Arabic script".

Note that "script" and "alphabet" are used here with a purely engineerish
meaning:

- The "Arabic script" is the subset of the Unicode characters whose "Script"
property (an informative property in the Unicode database) has the value
"Arabic";

- The "Urdu alphabet" is the minimum subset of "Arabic script" characters
which would be included on a keyboard or in a font designed for the Urdu
language.

Both definitions are quite self-referential and technology-specific, and it
would be an error to equate them with the same terms as used in linguistics
or other fields.

> Is Latin a subset of English?

By the above definitions, it is the English alphabet which is a subset of
the Latin script:

- The "Latin script" is the subset of the Unicode characters whose "Script"
property has the value "Latin";

- The "English alphabet" is the minimum subset of "Latin script" characters
which would be included on a keyboard or in a font designed for the English
language.
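
For the programmers on the list, the two subset relations above can be
checked mechanically. What follows is only a rough sketch under stated
assumptions: Python's standard library does not expose the Unicode "Script"
property, so the hypothetical helper script_by_name() approximates a script
by character-name prefix ("ARABIC ...", "LATIN ..."), which is not identical
to the real property values. The Urdu letters are a hand-picked sample, not
a complete keyboard or font repertoire.

```python
import string
import unicodedata


def script_by_name(prefix):
    """Approximate a "script" as all code points whose character name
    starts with `prefix` (an assumption, NOT the real Script property)."""
    return {
        chr(cp)
        for cp in range(0x110000)
        if unicodedata.name(chr(cp), "").startswith(prefix + " ")
    }


arabic_script = script_by_name("ARABIC")
latin_script = script_by_name("LATIN")

# Illustrative "alphabets": a hand-picked sample of letters used for Urdu
# (TTEH, PEH, TCHEH, DDAL, RREH, JEH, KEHEH, GAF, NOON GHUNNA,
# HEH DOACHASHMEE, HEH GOAL, FARSI YEH, YEH BARREE) and the ASCII letters.
urdu_sample = set(
    "\u0679\u067e\u0686\u0688\u0691\u0698\u06a9"
    "\u06af\u06ba\u06be\u06c1\u06cc\u06d2"
)
english_alphabet = set(string.ascii_letters)

# Is each "alphabet" contained in its "script"?
print(urdu_sample <= arabic_script)
print(english_alphabet <= latin_script)

# And the reverse direction, which is the claim being objected to:
print(arabic_script <= urdu_sample)
print(latin_script <= english_alphabet)
```

The containment only ever runs one way: an alphabet, in this engineerish
sense, is a selection from a script, never the other way round.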


> To write the word corvus, do you write Cee Oh Ar Vee You Ess? No, you
> don't.

How not? What would you reply if someone asks you: "How do you spell
'corvus'?"

_ Marco