Dear Nina and friends,

Pardon me for being quite slow in replying recently. I will take
some time to clear the backlog of outstanding tasks, including
clearing out the Files section to allow more uploads.

Thanks for your question. It gives me the chance to organise my
thoughts on the topic and put them into words.

Writing has always been a good way of transmitting knowledge and
recording information. By hand, we can write letters and scripts of
every kind. The problem comes when we want to do the same with
computers and printers.

All inventions start in a simple way. Computers were first built to
recognise only a small set of characters (128 to be precise). This
was partly due to technological limitations, but it also met the
practical requirements of the time: the basic Latin alphabet used by
the Western European languages, together with digits and punctuation,
fits within 128 characters. So, we got the mechanical typewriter
(<50 characters), then the electronic typewriter, then the computer.

As the use of computers increased, and people needed to communicate
electronically in other languages, various solutions emerged. For
the Pali Roman script, there were two prevailing solutions: (1)
Velthuis encoding, and (2) fonts using the extended character set[1]
(another 128 characters, 256 in total). So, we see some communication
in Velthuis (as on this mailing list), while some people use Pali
fonts for their documents.
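
To make the difference between the two concrete, here is a minimal
Python sketch that turns Velthuis-encoded Pali into the Unicode
characters. The function name and the table are my own illustration,
not part of any standard tool. It handles lowercase only, and it
assumes a dot before t, d, n, l or m is always a Velthuis marker, so a
full stop typed directly against one of those letters would be
misread.

```python
# Velthuis digraphs for Pali Roman and their Unicode equivalents
# (lowercase only; an illustrative table, not a complete converter).
VELTHUIS_TO_UNICODE = {
    "aa": "\u0101",  # a with macron
    "ii": "\u012b",  # i with macron
    "uu": "\u016b",  # u with macron
    '"n': "\u1e45",  # n with dot above
    "~n": "\u00f1",  # n with tilde
    ".t": "\u1e6d",  # t with dot below
    ".d": "\u1e0d",  # d with dot below
    ".n": "\u1e47",  # n with dot below
    ".l": "\u1e37",  # l with dot below
    ".m": "\u1e43",  # m with dot below
}


def velthuis_to_unicode(text):
    """Replace each two-character Velthuis sequence, scanning left to right."""
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in VELTHUIS_TO_UNICODE:
            out.append(VELTHUIS_TO_UNICODE[pair])
            i += 2  # consumed both characters of the digraph
        else:
            out.append(text[i])
            i += 1
    return "".join(out)
```

For example, velthuis_to_unicode("sa.myutta nikaaya") gives the title
with the dotted m and the long a as single Unicode characters.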

The problem with Pali fonts is that most of them are Windows-only,
and documents typeset in these fonts become illegible on Macintosh
and Unix systems.
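
A small Python sketch of why this happens. The byte value 0xE3 is a
made-up assumption: I am supposing a legacy Pali font that draws the
long a at that position. Without that font, each system falls back to
its own 256-character table, and the same byte means something else.

```python
# Hypothetical: a legacy Pali font draws "a with macron" at byte 0xE3.
raw = bytes([0xE3])

# Without the Pali font, each platform reads the byte through its own table.
as_windows = raw.decode("cp1252")    # Windows Western European table
as_mac = raw.decode("mac_roman")     # classic Macintosh table

# In Unicode the long a has one fixed code point, U+0101, on every system.
unicode_bytes = "\u0101".encode("utf-8")
same_everywhere = unicode_bytes.decode("utf-8")
```

Here as_windows and as_mac come out as two different characters,
neither of them the long a, while same_everywhere is the long a
regardless of platform.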

Unicode resolves this problem by going beyond 256 characters to a
character map of many thousands of characters. The latest version,
Unicode 4.0.0, contains 96,382 characters. The CJK (or Han) subset
alone contains 70,207 ideographic characters defined by national and
industry standards of China, Japan, Korea, Taiwan, Vietnam and
Singapore[2]. In fact, it is impossible to imagine the Chinese
inventing the typewriter.

HTML 4.0 uses ISO 10646/Unicode as its official character set[3]. New
browsers support Unicode, and most of them use the UTF-8 encoding[4].
As originally designed, UTF-8 could address about 2,147 million (2^31)
code points; the current standard restricts it to the Unicode range
of 1,114,112 code points.
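
As a rough illustration of how UTF-8 stores characters, the Python
sketch below shows the bytes produced for a single character. The
helper name utf8_hex is my own. Plain ASCII letters stay at one byte,
and the Pali diacritics take two or three.

```python
def utf8_hex(ch):
    """Return the UTF-8 bytes of one character as space-separated hex."""
    return " ".join(f"{b:02x}" for b in ch.encode("utf-8"))


def utf8_length(ch):
    """Return how many bytes UTF-8 needs for one character."""
    return len(ch.encode("utf-8"))
```

For instance, utf8_length("a") is 1, utf8_length("\u0101") (long a)
is 2, and utf8_length("\u1e43") (m with dot below) is 3. Characters
beyond U+FFFF need 4 bytes.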

[1] http://www.htmlhelp.org/reference/charset/
[2] http://www.unicode.org/versions/Unicode4.0.0/ch01.pdf
[3] http://en.wikipedia.org/wiki/Unicode_and_HTML
[4] http://en.wikipedia.org/wiki/UTF-8

Some information on Unicode:
http://www.alanwood.net/unicode/browsers.html
http://en.wikipedia.org/wiki/Unicode
http://www.unicode.org/unicode/history/

To test whether your computer supports Unicode Pali Roman characters,
check out this page:
http://www.tipitaka.net/forge/unicode.htm
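
If you have Python at hand, a similar check can be done locally. This
is only a sketch under my own assumptions: the list of characters and
the function names are mine, and it covers the lowercase Pali
diacritics only.

```python
import unicodedata

# The lowercase Pali Roman letters that fall outside plain ASCII.
PALI_DIACRITICS = "\u0101\u012b\u016b\u1e45\u00f1\u1e6d\u1e0d\u1e47\u1e37\u1e43"


def describe(ch):
    """Return the code point, the character itself, and its Unicode name."""
    return f"U+{ord(ch):04X}  {ch}  {unicodedata.name(ch)}"


def report():
    """Return one line per Pali diacritic, ready to be printed."""
    return "\n".join(describe(ch) for ch in PALI_DIACRITICS)
```

Running print(report()) lists the ten characters. If the middle
column shows boxes or question marks, the font or terminal in use
does not cover those characters.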


Hope that helps.


metta,
Yong Peng.


--- In Pali@yahoogroups.com, Nina van Gorkom wrote:
I did not follow the thread about Unicode, because I have an iMac. I
am building on Velthuis, what should I do? Is Unicode suitable for
iMac and how can I learn this?

> With the advent of Unicode, Velthuis may be forgotten soon.