Re: International Dhamma Research Tool

Michael,

What you are suggesting regarding a file structure is a good
idea. But I caution against considering it a solution. It is
simply a good way to organize files. Exceptions will be the
norm.

I believe that harmonizing one-to-one relationships between
disparate material will prove impossible rather than merely
difficult. You must find that this is the case with your own
glossology. From sutta to sutta (and more so at finer
granularity) there are already differing structures, each with
passionate defenders. Even if you recompiled everything
yourself, I do not think it would be possible (or necessary) to
find a file structure compromise.

The hierarchical nature of a file system is at the root of the
problem. I think you would agree that the Pali represents a web
of intertwined cycles and brilliant spaghetti.

A relational database is much closer to the Pali than a
hierarchy. However, it too has problems, which I will address
below. A database uniquely identifies each record (a sutta, for
example). The records themselves are ignorant of the relations
between them. A relation, on the other hand, is a separate
entity which does know the relationship between particular
records. There is no limit to the number or the types of
relationships. A record (sutta, word, etc) is related to its
comments, translations, collection, or any number of other
types. Each of these is in turn related to the others. Once
defined, relationships can be determined easily and quickly.
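
As a rough sketch of what I mean by relations being separate
from records, here is a small example using Python's built-in
sqlite3 module. The table names, identifiers, and relation types
are all hypothetical, made up only for illustration. It is not a
proposed schema.

```python
import sqlite3

# Hypothetical schema: every name and identifier here is illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE record (
        id    TEXT PRIMARY KEY,  -- unique identity of a sutta, word, comment, ...
        kind  TEXT NOT NULL,
        title TEXT NOT NULL
    );
    CREATE TABLE relation (      -- the only place that knows how records relate
        from_id TEXT NOT NULL REFERENCES record(id),
        to_id   TEXT NOT NULL REFERENCES record(id),
        type    TEXT NOT NULL
    );
""")
conn.executemany("INSERT INTO record VALUES (?, ?, ?)", [
    ("sutta-1", "sutta", "An example sutta"),
    ("trans-1", "translation", "An example translation"),
    ("comm-1", "commentary", "An example commentary"),
])
conn.executemany("INSERT INTO relation VALUES (?, ?, ?)", [
    ("trans-1", "sutta-1", "translation_of"),
    ("comm-1", "sutta-1", "commentary_on"),
    ("comm-1", "trans-1", "quotes"),
])

def related_to(record_id):
    """Return (type, from_id) pairs for every record pointing at record_id."""
    rows = conn.execute(
        "SELECT type, from_id FROM relation "
        "WHERE to_id = ? ORDER BY type, from_id",
        (record_id,),
    )
    return rows.fetchall()

print(related_to("sutta-1"))
```

The record table never mentions translations or commentaries.
Adding a new kind of relationship means adding rows to the
relation table, not changing the records.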

However, even a relational database is flawed. It is inflexible
in the face of re-interpretation and mistakes. It requires
maintenance, and its complexity grows rapidly with each new
record (which adds new relationships to and from existing
material). It requires centralization and an exact structure.
These disadvantages make it ill-suited to collecting suttas.

The internet and material found on the web are better models.
However, HTML defines relationships in only one direction (one
page linking to another), and each page fails to create
intelligent relationships (otherwise, there would be no need for
Google). Each web page bears the impossible responsibility of
accurately determining its own relationships when its author
cannot possibly know everything that is available.
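
To illustrate the one-direction problem, here is a minimal
sketch in Python of what an outside tool has to do to recover
the reverse direction. The page names are hypothetical. The
point is only that the sutta page knows nothing about who links
to it until something else inverts the links.

```python
from collections import defaultdict

# Hypothetical pages and the links each one makes (one direction only).
outgoing = {
    "commentary.html": ["sutta.html"],
    "translation.html": ["sutta.html"],
    "sutta.html": [],
}

def backlinks(outgoing_links):
    """Invert page -> targets into target -> sorted list of linking pages."""
    incoming = defaultdict(list)
    for page, targets in outgoing_links.items():
        for target in targets:
            incoming[target].append(page)
    return {page: sorted(sources) for page, sources in incoming.items()}

print(backlinks(outgoing))
```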

The semantic web considers these requirements, advantages, and
disadvantages.

http://www.w3.org/2001/sw/

Well-defined documents (suttas) and the tools (search engines,
RSS feeds, etc) determine relationships, rather than structure.
Documents need only be available and know what they are, not
where they are or how they relate to other documents.

In the simplest example, this can be achieved by placing
standardized meta tags within an HTML document. These say:

I have a title
I have an author
I have a unique identity
I have a PTS reference
I have a Chinese Canon reference
I am a commentary
I am based on a PTS reference
I am based on another PTS reference
I am a translation
I am a translation of a VRI reference
etc
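
As a rough sketch of how a tool might read such statements, the
Python below collects meta tags from a page using the standard
html.parser module. The "DC." names follow the Dublin Core
convention for HTML meta tags. The "dhamma." names, the
identifier, and the reference values are placeholders I made up
for illustration, not an existing vocabulary.

```python
from html.parser import HTMLParser

# Hypothetical page; the "dhamma.*" names and all values are placeholders.
SAMPLE = """<html><head>
<title>Example commentary</title>
<meta name="DC.title" content="Example commentary">
<meta name="DC.creator" content="Example Author">
<meta name="DC.identifier" content="urn:example:commentary-1">
<meta name="dhamma.type" content="commentary">
<meta name="dhamma.based-on-pts" content="EXAMPLE-PTS-REF-1">
<meta name="dhamma.based-on-pts" content="EXAMPLE-PTS-REF-2">
</head><body></body></html>"""

class MetaCollector(HTMLParser):
    """Collect <meta name=... content=...> pairs; a name may repeat."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content:
            self.meta.setdefault(name, []).append(content)

def read_meta(html_text):
    """Return a dict mapping each meta name to its list of values."""
    collector = MetaCollector()
    collector.feed(html_text)
    return collector.meta

for name, values in read_meta(SAMPLE).items():
    print(name, values)
```

The document only says what it is. Which other documents share
one of its references is left for the tool to work out.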

RDF is the emerging standard for such ontologies. It is used for
indexing library catalogs, syndicated news networks, and
genealogical research.

http://www.w3schools.com/rdf/rdf_intro.asp
http://dublincore.org/documents/usageguide/
http://www.w3.org/TR/rdf-primer/#intro
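
For a feel of the RDF model, here is a minimal sketch in plain
Python that treats each statement as a (subject, predicate,
object) triple. It is not a real RDF library, only an
illustration of the idea. The Dublin Core namespace is real. The
example.org namespace, the translationOf and commentaryOn
properties, and the identifiers are hypothetical.

```python
DC = "http://purl.org/dc/elements/1.1/"  # Dublin Core element namespace
EX = "http://example.org/dhamma/"        # hypothetical namespace

# Each statement stands alone; no document holds the whole structure.
triples = {
    (EX + "sutta-1", DC + "title", "An example sutta"),
    (EX + "trans-1", DC + "title", "An example translation"),
    (EX + "trans-1", EX + "translationOf", EX + "sutta-1"),
    (EX + "comm-1", EX + "commentaryOn", EX + "sutta-1"),
}

def match(subject=None, predicate=None, obj=None):
    """Return sorted triples matching the pattern; None is a wildcard."""
    return sorted(
        t for t in triples
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )

# Everything that says something about sutta-1, in either role.
for triple in match(obj=EX + "sutta-1"):
    print(triple)
```

Because statements are independent, anyone can publish new ones
about an existing sutta without touching the sutta itself.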

=====
<ale></genaud.org>
