Dear Frank and Jon,

Thanks for the information, Frank.

My current understanding of TEI is still limited, and I will be
spending some time going through it as part of the work I plan for
PDF. Still, I had a look at the DTD of the Lite version, and there are
at least a few hundred elements and attributes, which already set my
head spinning. ;-) I can understand why your markup for CST4 is "very
lite".

TEI is a very generalised scheme which allows a very high level of
customisation. The names it uses for the elements and attributes are
also generalised, and tend towards verbosity.

Adopting TEI increases the chances of interchangeability with other
similarly encoded texts. However, TEI may be overly complex for the
technically challenged.

I would prefer to use a non-TEI native scheme for Pali texts, and then
employ tools to convert to TEI as and when needed. A specially
developed scheme for Pali texts can use names that are more natural
than those from TEI. An important aspect, the metadata, may also be
better captured with a schema specific to the Pali texts. It is
probably XML, not TEI, that we seek to apply to the Pali texts.
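
To make the "native scheme first, convert to TEI when needed" idea a
little more concrete, here is a minimal sketch in Python. The native
element names (sutta, title, para, verse, line), the id attribute and
the mapping table are all hypothetical, made up purely for
illustration; only the target names (div, head, p, lg, l) are actual
TEI elements. A real conversion would also have to produce the TEI
header and the enclosing text/body structure, which is left out here.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping: native element name -> (TEI element name, extra attributes).
# The native names are invented for this sketch; they are not an agreed scheme.
NATIVE_TO_TEI = {
    "sutta": ("div", {"type": "sutta"}),
    "title": ("head", {}),
    "para": ("p", {}),
    "verse": ("lg", {}),
    "line": ("l", {}),
}


def to_tei(native):
    """Recursively copy a native element into its TEI counterpart."""
    tei_name, extra = NATIVE_TO_TEI[native.tag]
    # Keep the native attributes and add any that the mapping requires.
    tei = ET.Element(tei_name, {**native.attrib, **extra})
    tei.text = native.text
    tei.tail = native.tail
    for child in native:
        tei.append(to_tei(child))
    return tei


# A tiny document in the hypothetical native scheme.
native_xml = (
    '<sutta id="dn1">'
    "<title>Brahmajālasutta</title>"
    "<para>Evaṃ me sutaṃ.</para>"
    "</sutta>"
)

tei_root = to_tei(ET.fromstring(native_xml))
print(ET.tostring(tei_root, encoding="unicode"))
```

The point of the table-driven approach is that the native names can
stay short and natural for Pali texts, while the correspondence to TEI
lives in one place and can be revised without touching the texts.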

I also intend to build some degree of knowledge management, rather
than simply text markup, into the proposed schema(s). However, I would
be happy to learn what you have already done with CST4, and to
collaborate with you on those aspects where we see advantages in
having common definitions.

metta,
Yong Peng.


--- In Pali@yahoogroups.com, Frank Snow wrote:

The VRI texts included with CST4
(http://www.tipitaka.org/cst/installation/ ) are in Text Encoding
Initiative format. I used TEI Lite as a starting point:
http://www.tei-c.org/Guidelines/Customization/Lite/ . Our current
markup could be called "very lite".