Peter T. Daniels wrote:
> I have a technical/practical problem relating to this thread.
>
> I have the job of typesetting a book whose text incorporates lots of
> diacritics and also passages in a variety of exotic scripts,
> and it has been typed "in Unicode." My computers are pre-Unicode
> (OS 8.6 and OS 9 with Language Kits [or 9.2 if I can find the
> relevant CD-ROM]); my fonts, obviously, are pre-Unicode (the
> ones used in WWS, basically).
>
> Does a Word file being "in Unicode" mean anything at all to
> me?
Yes: it means that you cannot feed it as-is to your old non-Unicode
typesetting system...
> Unicode doesn't include a glyph
{ Technical hair splitting: Unicode doesn't include *any* glyph, just
"abstract characters". }
> for, for instance, o with underdot and macron and tilde and acute accent
> (not an impossible combination), does it?
I just checked: Unicode does not encode it as a single precomposed character.
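
For anyone who wants to repeat the check, here is a minimal sketch using the
`unicodedata` module from Python's standard library. The particular order of
the combining marks is my assumption; the original question only names the
diacritics. NFC normalization composes a sequence as far as the standard
allows, so if more than one code point survives, there is no single
precomposed character for the whole combination.

```python
import unicodedata

# o + COMBINING DOT BELOW + COMBINING MACRON + COMBINING TILDE
# + COMBINING ACUTE ACCENT (the order of the marks is an assumption)
sequence = "o\u0323\u0304\u0303\u0301"

# NFC composes base letters and marks wherever a precomposed form exists.
composed = unicodedata.normalize("NFC", sequence)

# List whatever code points remain after composition.
for ch in composed:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# More than one code point left means no single precomposed character.
has_single_code_point = len(composed) == 1
print("single precomposed character:", has_single_code_point)
```
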
> But I have the ability of typesetting such a character using a
> combination of precomposed and floating items. But how could
> "Unicode" tell FrameMaker (a typesetting program with far
> more control than Word) what to do?
{ More hair splitting... It cannot. Unicode is a standard, not a software
product or technology, so it "speaks" to software developers (humans) not to
software (machines). }
In the simplest cases, combining diacritical marks can be implemented in fonts
simply by changing the "advance" of the relevant glyphs (typically to zero, so
that the mark overprints the preceding base letter). This would also let
partially Unicode-unaware applications display simple letter-diacritic
combinations.
But, unfortunately, such a naïve approach would not work for a complex case
such as the one you mentioned above, where a single base letter has three
diacritics stacked on top of it. Such complex Unicode combinations can only
be rendered by fully Unicode-aware applications using the so-called "smart
font technologies" (such as OpenType, or Apple's AAT as rendered through
ATSUI).
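
To make the distinction between the simple and the complex case concrete, here
is a rough sketch in Python. The function names `split_clusters` and
`needs_smart_layout` are hypothetical, and the rule "more than one mark above
the same base needs smart layout" is only my simplification of the argument
above, not anything defined by the Unicode standard. It relies on the canonical
combining class, where 230 means "above" and 220 means "below".

```python
import unicodedata

ABOVE = 230  # canonical combining class for marks placed above the base


def split_clusters(text):
    """Group each base character with the combining marks that follow it."""
    clusters = []
    for ch in text:
        if unicodedata.combining(ch) and clusters:
            clusters[-1][1].append(ch)
        else:
            clusters.append((ch, []))
    return clusters


def needs_smart_layout(text):
    """Heuristic (an assumption): stacked marks above one base need more
    than a glyph that simply overprints the preceding letter."""
    for _base, marks in split_clusters(text):
        above = [m for m in marks if unicodedata.combining(m) == ABOVE]
        if len(above) > 1:
            return True
    return False


simple = "o\u0301"                     # o + acute
complex_ = "o\u0323\u0304\u0303\u0301"  # o + underdot + macron + tilde + acute
print(needs_smart_layout(simple), needs_smart_layout(complex_))
```

A real layout engine decides this from the font's positioning tables rather
than from a count of marks, so treat the heuristic as an illustration only.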
> Do I have to throw away my entire collection of exotic fonts
> when I move up to OS X 10.2, because they were made before
> there was a Unicode standard -- and because (in particular)
> the abugida characters (aksharas of Indic scripts, and
> Ethiopic) are done with clever combinations of near-arbitrary
> components?
>
> Or what?
Or what!
Although "Unicoders" may not like it, the old techniques you used before
Unicode can still be used with Unicode too.
We programmers would probably call such techniques "quick-and-dirty
tricks", "font hacking" or "pseudo-encoding" but, if what you need is just
paper output, and you need it *now*, and you need to obtain that result
using the equipment and software that you already have, then... you
basically have no other choice.
E.g., I have a Devanagari font, called Shusha, which assigns a Devanagari
glyph to each ASCII character. With that font, I can type the ASCII
characters "makaO-", change the font to Shusha, and see it as the Devanagari
transliteration of "Marco", including its nice repha glyph (encoded as "-")
in its proper place. There is no reason why this old technique should not
work even inside a Unicode document.
Of course, using these techniques stores up problems for the future. This
approach may be OK as long as you only need paper output, but it would stop
working if, e.g., your publisher decided that they also wanted an on-line
edition of your book, because the stored text is still just ASCII and only
the font makes it look like Devanagari. But, again, if this is your only
choice *now*, you might accept the potential burden of having to redo part
of your work in the unlikely event of a future electronic edition.
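
A small Python sketch shows why font-hacked text does not survive outside
the printed page. It looks at the character names of the stored text, which is
all that a search engine or a browser without the Shusha font would see. The
Unicode spelling used for comparison is my own assumption of how "Marco" would
be encoded in Devanagari, and `script_words` is a hypothetical helper name.

```python
import unicodedata


def script_words(text):
    """Return the first word of each character's Unicode name,
    used here as a crude stand-in for the character's script."""
    return {unicodedata.name(ch).split()[0] for ch in text}


# What is actually stored when the Shusha font trick is used.
font_hack = "makaO-"

# Assumed Unicode encoding: MA, vowel sign AA, RA, VIRAMA, KA, vowel sign O.
unicode_text = "\u092e\u093e\u0930\u094d\u0915\u094b"

print(sorted(script_words(font_hack)))
print(sorted(script_words(unicode_text)))
```

Note also that the font-hacked string keeps the repha in visual order, after
the syllable it sits on, whereas Unicode stores RA + VIRAMA before the
consonant, so a conversion would need reordering and not just a lookup table.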
> Don't I still want to be using PostScript fonts, because high-end
> typesetting machines are driven by PostScript?
In theory, PostScript and Unicode do not exclude each other: the former is a
font technology (so it has to do with glyphs) and the latter is a character
encoding (so it has to do with an abstract representation of text), so they
have different scopes. In practice, I am afraid that PostScript has not
(yet?) developed workable Unicode support.
_ Marco