Dear Lennart,

Thank you very much. This is very interesting. Do you think the inflected forms other than the nominative can also tell us something?

By the way, I have been looking for works or books that mention particular suttas that are present in some versions of the Pali canon but absent from others. Have you heard of any work of this kind? I would appreciate your kind help.

With metta,
Chanida



--- In Pali@yahoogroups.com, Lennart Lopin <novalis78@...> wrote:
>
> Hi Chanida,
>
> Below is a copy of an email I posted earlier this year to the Pali group (I
> could not find the link on the list).
> It might help you to a certain extent, but it is purely statistical and just a
> rough approximation - not between suttas individually but on the level of
> books.
>
> ===
> ... As you all know the "Tipitaka" consists of various text strata. This is
> very obvious of course to anyone reading and comparing the vocabulary, style
> and grammatical expressions used in the Vinaya, Sutta and Abhidhamma texts.
> Prof. Kingsbury did a statistical analysis on this a couple of years ago (see
> here<http://docs.google.com/viewer?a=v&q=cache:YQUw8L4F9WsJ:www.ling.upenn.edu/~kingsbur/inducing.pdf+paul+kingsbury+pali+university+penn&hl=en&gl=us&pid=bl&srcid=ADGEEShpmHsMa8-J_hM6MWDnj1M4JUtuOKd-jORCS-P_zQv0l2PnbgmXEz3CjSBpgz8gMpnlu5W3bi9H6Gq8tr94h6j4RnjmxjJxy34y3hqmjwecS50s97iUa4TFL2sPGhp_VFx5q7vh&sig=AHIEtbQoWszD1QN49uo-4RNw676XN4Apvg>
> ).
>
> So, whenever one uses CST4 or similar tools to search and compare text
> snippets, one can see that certain expressions always seem to surface in
> certain books while other books contain not a single entry for that
> particular word or phrase (take for instance "sabhāv*" - you won't find it
> in the 4 Nikāya (for obvious reasons), but already the Milinda mentions it,
> etc.)
>
> So, while Prof. Kingsbury's approach was very straightforward (but complex),
> it only covered a small portion of available books and only categorized
> those few into three basic categories (early, middle, late text strata).
>
> Taking a much simpler approach I created the following report which you can
> download (see link below). What I was interested in was to map out,
> automatically, the relationship (in percentages) between all canonical and
> post-canonical books based on their similarities.
>
> Based on that idea I wrote a little program which extracted a-declension
> nominative forms as indicators of a certain semantic proximity (text-chain)
> from all 217 books (VRI Tipitaka edition) and compared them against each
> other (> 47089 combinations).
>
> I sorted the resulting table by percentage and uploaded it as well (see
> below). Of course the results are crude as we are just comparing one
> characteristic (nom. sing. a-decl). However, because this test is applied to
> the entire range of texts we can still use the percentages as a simple
> indicator of proximity. The higher the percentage between two books, the more
> vocabulary they share. This is esp. interesting when we compare the
> relationship between multiple books. One could play around with this even
> more, comparing other grammatical features and then overlaying those
> percentages to arrive at an even stronger indicator of the relationship
> between the various books.
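
A minimal sketch of the kind of comparison described above, in Python. It is not the program mentioned in this message. It assumes that any word ending in "-o" is an a-declension nominative singular (a crude heuristic that also catches particles such as "kho"), and that the percentage is the share of one book's extracted forms that also occur in the other book. The function names and the toy texts are made up for illustration. Comparing every book with every book, itself included, gives 217 x 217 = 47089 ordered pairs for 217 books.

```python
import re
import unicodedata
from itertools import product

# Matches runs of Unicode letters, so precomposed Pali characters
# such as "ā" or "ṅ" stay inside the word.
WORD_RE = re.compile(r"[^\W\d_]+")


def nominative_forms(text):
    """Return the set of lower-cased words ending in 'o'.

    Assumption: a final 'o' marks a masculine a-stem nominative
    singular (e.g. "dhammo"). This is a heuristic, not a parser.
    """
    text = unicodedata.normalize("NFC", text)
    words = (w.lower() for w in WORD_RE.findall(text))
    return {w for w in words if w.endswith("o")}


def overlap_percent(a, b):
    """Percentage of the forms in set a that also occur in set b."""
    if not a:
        return 0.0
    return 100.0 * len(a & b) / len(a)


def compare_books(books):
    """Compare every ordered pair of books, highest percentage first.

    `books` maps a book name to its full text (hypothetical input
    format). The result has len(books) ** 2 rows.
    """
    forms = {name: nominative_forms(text) for name, text in books.items()}
    rows = [
        (x, y, overlap_percent(forms[x], forms[y]))
        for x, y in product(forms, repeat=2)
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Toy input, invented for illustration only.
toy_books = {
    "A": "buddho dhammo saṅgho bhagavā",
    "B": "dhammo saṅgho maggo",
    "C": "rūpaṃ vedanā",
}

for first, second, percent in compare_books(toy_books):
    print(f"{first} -> {second}: {percent:.1f}%")
```

Because the percentage here is taken relative to the first book, the result for A -> B can differ from B -> A when the two books yield different numbers of forms. A symmetric measure would be an equally reasonable choice.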
>
> However, for my purposes, this first run (took 2 hours to complete) was
> already more than enough. I guess there is a lot of information here,
> especially for the lexicographers among you, and you are welcome to
> reuse the source code, which I uploaded as well.
>
> But it is quite interesting to see which books form groups in terms of their
> "semantic" (vocabulary) proximity. For instance you will see that the 4
> Nikaya share a great percentage in similarity as expected. We can also see
> that parts of the AN match the Puggalapannatti or observe the closeness
> between Nettipakarana and Petakopadesa. From here we can go through the list
> and discover interesting relationships which may not have been that obvious.
>
> So this might help some of you find the "next best book" to read / study.
>
> Download the report here:
>
> http://www.nibbanam.com/pali_language_tools.html#pprox
>
>
> mettāya,
>
> Lennart
>
>
>
> On Fri, Nov 12, 2010 at 3:09 AM, Poe <jchanida@...> wrote:
>
> >
> >
> > Dear friends,
> >
> > May I ask for help, please?
> >
> > I am looking for books or research papers that compare Pali suttas in
> > different versions of the Pali canon, such as the PTS, Siamese, Burmese and
> > Sinhalese versions. I myself know only some works that compare the Pali
> > canon with the Chinese Agamas or Gandhari texts, but not a comparison among
> > versions of the Pali canon. Would very much appreciate your kind
> > suggestions. Comparison of Pali canon, either part or whole, is of interest.
> > Thank you very much in advance and looking forward to your reply.
> >
> > With metta,
> > Chanida
> >
>
>