Re: International Dhamma Research Tool

AG and all,

I was hoping you would jump in here with this knowledge that I know you have
from previous discussions.

Of course I am speaking from the point of view of the straight-line
webmaster. Directories, files, and links are about the extent of my
vocabulary.

I think it is important that at least those who are or would be putting
together sutta collections reach some sort of 'standard' for file names and
directory structure. It seems that following the Nikayas' present structure
for directories is the most logical. Then I was plugging the notion of a
standard for file naming that would make the file system itself 'readable'.
A person looking at just the file name 'an01_001' knows exactly what sutta
they are dealing with, and with a naming convention such as I am suggesting
the files will appear in order in directory listings.

For just the sutta section I am suggesting:

|-- /dhamma-vinaya/file_structure_template
|
| |-- /dhamma-vinaya/file_structure_template/an
Files numbered consecutively throughout book, but not throughout Nikaya.
| | |-- /dhamma-vinaya/file_structure_template/an/01_ones
| | | FILE NAMES: /an01_001.htm
| | |-- /dhamma-vinaya/file_structure_template/an/02_twos
| | | FILE NAMES: /an02_001.htm
| | |-- /dhamma-vinaya/file_structure_template/an/03_threes
| | | FILE NAMES: /an03_001.htm
| | |-- /dhamma-vinaya/file_structure_template/an/04_fours
| | | FILE NAMES: /an04_001.htm
| | |-- /dhamma-vinaya/file_structure_template/an/05_fives
| | | FILE NAMES: /an05_001.htm
| | |-- /dhamma-vinaya/file_structure_template/an/06_sixes
| | | FILE NAMES: /an06_001.htm
| | |-- /dhamma-vinaya/file_structure_template/an/07_sevens
| | | FILE NAMES: /an07_001.htm
| | |-- /dhamma-vinaya/file_structure_template/an/08_eights
| | | FILE NAMES: /an08_001.htm
| | |-- /dhamma-vinaya/file_structure_template/an/09_nines
| | | FILE NAMES: /an09_001.htm
| | |-- /dhamma-vinaya/file_structure_template/an/10_tens
| | | FILE NAMES: /an10_001.htm
| | |-- /dhamma-vinaya/file_structure_template/an/11_elevens
| | | FILE NAMES: /an11_001.htm
|
| |-- /dhamma-vinaya/file_structure_template/dn
| | FILE NAMES: /dn01.htm
|
| |-- /dhamma-vinaya/file_structure_template/mn
| | FILE NAMES: /mn001.00.htm
|
| |-- /dhamma-vinaya/file_structure_template/sn
| |
| | |-- /dhamma-vinaya/file_structure_template/sn/01_sagv
| | | FILE NAMES: /sn_sagv01.00.htm [nikaya_vaggaSAMYUTTA.sutta skips sub-vaggas]
| | |-- /dhamma-vinaya/file_structure_template/sn/02_nv
| | | FILE NAMES: /sn_nv01.00.htm [nikaya_vaggaSAMYUTTA.sutta skips sub-vaggas]
| | |-- /dhamma-vinaya/file_structure_template/sn/03_kv
| | | FILE NAMES: /sn_kv01.00.htm [nikaya_vaggaSAMYUTTA.sutta skips sub-vaggas]
| | |-- /dhamma-vinaya/file_structure_template/sn/04_salv
| | | FILE NAMES: /sn_salv01.00.htm [nikaya_vaggaSAMYUTTA.sutta skips sub-vaggas]
| | |-- /dhamma-vinaya/file_structure_template/sn/05_mv
| | | FILE NAMES: /sn_mv01.00.htm [nikaya_vaggaSAMYUTTA.sutta skips sub-vaggas]

NOTES:
For 'file_structure_template' substitute the source: ati, wp, pts, chinese,
japanese, sinhalese, thai, etc...
In the Samyutta, the 'vagga' abbreviation in the file name could be dropped
as the samyutta numbers are sequential (in the Bhk. Bodhi version put out by
Wisdom) but I like the idea of including it as it is helpful in
recollection.
Sutta divided into several files: dn01.01.htm
File with multiple sequentially numbered suttas: an02_001-100.htm
File with multiple non-sequentially numbered suttas: an02_001_005_033.htm
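
As a rough illustration of the single-sutta Anguttara pattern above (for
example 'an01_001.htm'), here is a minimal Python sketch. The function names
sutta_filename and parse_sutta_filename are hypothetical, and the sketch does
not cover the dn, mn, sn, or multi-sutta variants. It only shows that
zero-padded numbers make a plain alphabetical sort match sutta order.

```python
import re

# Hypothetical helpers for the 'an01_001.htm' style of name only.
_PATTERN = re.compile(r"^([a-z]+)(\d{2})_(\d{3})\.(\w+)$")


def sutta_filename(nikaya, book, sutta, ext="htm"):
    """Build a name such as 'an01_001.htm' (book padded to 2 digits, sutta to 3)."""
    return f"{nikaya}{book:02d}_{sutta:03d}.{ext}"


def parse_sutta_filename(name):
    """Split a name such as 'an01_001.htm' into (nikaya, book, sutta)."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a recognised sutta file name: {name!r}")
    nikaya, book, sutta, _ext = match.groups()
    return nikaya, int(book), int(sutta)


# Names created out of order still sort into sutta order as plain strings.
names = [sutta_filename("an", b, s) for b, s in [(10, 1), (2, 12), (1, 1), (2, 3)]]
ordered = sorted(names)
```

Without the zero padding, 'an10_1' would sort ahead of 'an2_3', which is the
reason for fixed-width numbers in the file names.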

--------------------------------
As I understand it, one of the goals of people thinking about this (the tech
side, not the Buddhist side) is that content be separated from formatting.
So those who are working on content need some structure that makes sense
without markup and I think that is what I am suggesting. Another principle
the tech side has mentioned is 'graceful degradation'...in other words, in a
situation where the mark-up is too sophisticated for the audience's
equipment, the materials must be made to 'degrade' to their level and still
offer sensible presentation.

One way I think this needs to be thought about is the case of total
breakdown (censorship or prohibitive costs) of the internet as we know it.
In other words, I am back to the CD and making the materials on the CD work
on a relatively simple computer while still providing the software etc., for
setup by advanced users as 'distribution' centers.

Overall, my suggestion in the previous post should be seen as a broad,
inclusive, and expandable structure for the organization of the various
sorts of materials needed to create a 'stand-alone' research center. I
believe the structure is simple enough to allow for broad re-interpretation
by individual webmasters.
Again, the hope would be that at least the sutta materials would be set up
so that the work done by one group would be easily adopted by another. (I
have just spent three weeks separating a couple of dozen German/Pali/English
suttas that were presented in a table that displayed each language side by
side...in other words, the file itself was impossible to read and madness to
separate into coherent suttas.)

At this time I know what I have in the outline in the previous post can be
done and is extremely helpful in Dhamma Research...this was roughly the
structure of BuddhaDust. So what is needed to carry it into the very
exciting arena of databases and such is:

Someone who knows what they are doing (AG?) working out what needs to be
done.
A set of instructions for webmasters as to how to mark-up their existing
materials.

Just so you know: if you are speaking about adding only 10 tags to every
sutta in the collection (and that would not be too helpful, as what is
really needed is 'linking' or 'side-by-side' presentation at the section
level), you are speaking about 17,505 or 84,000 suttas, depending on who's
counting, so 175,050 to 840,000 tags.

As it is, the straight-line link does a remarkable job of connecting things
at all sorts of levels, and it is my feeling that if this project is to
become real it will need to begin at that level.

Take Care;
and may your life be long and happy!
Michael Olds

-----Original Message-----
From: Alexander Genaud [mailto:alexgenaud@...]
Sent: Saturday, November 27, 2004 10:00 AM
To: Pali@yahoogroups.com
Subject: [Pali] Re: International Dhamma Research Tool

Michael,

What you are suggesting regarding a file structure is a good
idea. But I caution against considering it a solution. It is
simply a good way to organize files. Exceptions will be the
norm.

I believe it will be impossible, not merely difficult, to harmonize
one-to-one relationships between disparate material. You must
find that this is the case with your own glossology. From sutta to
sutta (and more so at finer granularity) there are already
differing structures with passionate defenders. Even if you
recompiled the material yourself, I do not think it would be
possible (or necessary) to find a file structure compromise.

The hierarchical nature of a file system is at the root of the
problem. I think you would agree that the Pali represents a web
of intertwined cycles and brilliant spaghetti.

A relational database is much closer to the Pali than a
hierarchy. However, it too has problems, which I will address
below. A database uniquely identifies each record (sutta for
example). The records themselves are ignorant of the relations
between them. On the other hand, a relation is a separate entity
which does know the relationship between particular records.
There is no limit to the number of relationships and types of
relationships. A record (sutta, word, etc) is related to its
comments, translations, collection, or any number of types. Each
in turn is related to each other. Once defined, relationships
can be determined easily and quickly.
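
To make the record/relation split concrete, here is a minimal sketch using
Python's built-in sqlite3 module. The table names, column names, record ids,
and relation kinds are all assumptions made for illustration, and the titles
are placeholders rather than real texts.

```python
import sqlite3

# In-memory database; the schema below is a hypothetical illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE record (
    id    TEXT PRIMARY KEY,   -- unique identity of a sutta, translation, etc.
    kind  TEXT NOT NULL,
    title TEXT NOT NULL
);
CREATE TABLE relation (
    from_id TEXT NOT NULL REFERENCES record(id),
    to_id   TEXT NOT NULL REFERENCES record(id),
    kind    TEXT NOT NULL     -- e.g. 'translation_of', 'commentary_on'
);
""")

# Records carry no knowledge of each other.
conn.executemany("INSERT INTO record VALUES (?, ?, ?)", [
    ("an01_001.pali", "sutta", "placeholder Pali text"),
    ("an01_001.en", "translation", "placeholder English translation"),
    ("an01_001.cmt", "commentary", "placeholder commentary"),
])

# Relations are separate rows that know which records they connect.
conn.executemany("INSERT INTO relation VALUES (?, ?, ?)", [
    ("an01_001.en", "an01_001.pali", "translation_of"),
    ("an01_001.cmt", "an01_001.pali", "commentary_on"),
])


def related_to(connection, record_id):
    """Return (id, relation kind) for every record that points at record_id."""
    rows = connection.execute(
        "SELECT r.id, rel.kind FROM relation AS rel "
        "JOIN record AS r ON r.id = rel.from_id "
        "WHERE rel.to_id = ? ORDER BY r.id",
        (record_id,),
    )
    return rows.fetchall()
```

Calling related_to(conn, "an01_001.pali") lists the translation and the
commentary, even though the Pali record itself stores nothing about them.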

However, even a relational database is flawed. It is inflexible
to re-interpretation and mistakes. It requires maintenance, and
each new record increases complexity exponentially (adding new
relationships to and from existing material). It requires
centralization and exact structure. These disadvantages make it
ill-suited to collecting suttas.

The internet and material found on the web are better models.
However, HTML defines relationships in only one direction (one
page linking to another) and each page fails to create
intelligent relationships (otherwise, there would be no need for
Google). Each web page has the impossible responsibility of
accurately determining its own relationships when its author
cannot possibly know everything available.

The semantic web considers these requirements, advantages, and
disadvantages.

http://www.w3.org/2001/sw/

Well-defined documents (suttas) and the tools (search engines,
RSS feeds, etc.) determine relationships, rather than structure.
Documents need only be available and know what they are, not
where they are nor how they relate to other documents.

In the simplest example, this can be achieved by placing
standardized meta tags within an HTML document. These say:

I have a title
I have an author
I have a unique identity
I have a PTS reference
I have a Chinese Canon reference
I am a commentary
I am based on a PTS reference
I am based on another PTS reference
I am a translation
I am a translation of a VRI reference
etc
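
As a sketch of how a tool might read such self-descriptions, the Python
below uses the standard html.parser module to collect meta tags and then
builds an index from each source reference to the documents based on it.
The meta names (dhamma.identifier, dhamma.type, dhamma.based-on), the sample
page, and its values are all hypothetical; no such vocabulary has been
agreed on here.

```python
from collections import defaultdict
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect name/content pairs from <meta> tags, keeping repeated names."""

    def __init__(self):
        super().__init__()
        self.meta = defaultdict(list)

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        fields = dict(attrs)
        name, content = fields.get("name"), fields.get("content")
        if name and content:
            self.meta[name].append(content)


def read_meta(html_text):
    """Return {meta name: [content, ...]} for one HTML document."""
    collector = MetaCollector()
    collector.feed(html_text)
    collector.close()
    return dict(collector.meta)


def index_by_source(pages):
    """Map each 'based-on' reference to the identifiers of documents citing it."""
    index = defaultdict(list)
    for html_text in pages:
        meta = read_meta(html_text)
        for ident in meta.get("dhamma.identifier", []):
            for source in meta.get("dhamma.based-on", []):
                index[source].append(ident)
    return dict(index)


# Hypothetical page: a translation that says what it is and what it is based on.
PAGE = """<html><head>
<title>Example translation</title>
<meta name="dhamma.identifier" content="example-0001">
<meta name="dhamma.type" content="translation">
<meta name="dhamma.based-on" content="an01_001">
<meta name="dhamma.based-on" content="an01_002">
</head><body>...</body></html>"""

index = index_by_source([PAGE])
```

The page only states what it is. The reverse relationship (which documents
are based on a given reference) is worked out by the tool, not by the page.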

RDF is the emerging standard for such ontologies used for
indexing library catalogs, syndicated news networks, and
genealogical research.

http://www.w3schools.com/rdf/rdf_intro.asp
http://dublincore.org/documents/usageguide/
http://www.w3.org/TR/rdf-primer/#intro

=====
<ale></genaud.org>
