Andy,

This is good news. The most important reason we all learn Pali, I'm sure, is to
understand the Pali Canon directly. I'm studying and teaching Pali at the same time,
so your ideas are very welcome.

Do send me your "word counts".

Sukhi.

P.

Andy wrote:

> Hi!
>
> Some good news for beginning Pali students and Pali teachers?
>
> Over the last few months, I have been doing some research into word
> frequency in the Pali Canon. Brother Jim at Aukana in England produced a
> list of unique word *forms* in the Pali Canon on the CSCD and the "count" or
> frequency for each of those unique word forms (i.e. dhamma has a count,
> dhammena has a count, etc.).
>
> I took that list, sorted it by frequency and discovered something amazing.
> The top 1000 word *forms* (exactly as you see them in the Pali Canon)
> account for 55% of all the words you will see in the Pali Canon.
>
> Oddly enough, many of these words do not appear at all in the free Pali
> courses for beginners.
>
> The Pali Canon on the CSCD has a total of approx. 2,700,000 words.
>
> The total number of unique word forms in the Pali Canon is 152,922.
>
> Total entries in the Paliwords dictionary: 20,119
>
> Here is a summary of the frequency breakdown of word *forms*:
>
> 001 - 100 843,592 occurrences (31%!)
> 101 - 200 169,092
> 201 - 300 109,124
> 301 - 400 83,892
> 401 - 500 66,760
> 501 - 600 57,025
> 601 - 700 48,312
> 701 - 800 42,217
> 801 - 900 37,467
> 901 - 1000 33,361
> Top 1000 word forms total: 1,490,842 approx. 55% of all words.
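
The totals in the table above can be checked with a few lines of Python. This is only a minimal sketch: the bucket figures and the 2,700,000 total are copied from the post, while the function names and the `{form: count}` mapping taken by `top_n_coverage` are assumptions made for illustration, not the format of the actual word list.

```python
# Occurrences per block of 100 ranked word forms, copied from the table above.
BUCKET_COUNTS = [
    843_592, 169_092, 109_124, 83_892, 66_760,
    57_025, 48_312, 42_217, 37_467, 33_361,
]
TOTAL_WORDS = 2_700_000  # approximate size of the Canon as quoted in the post


def cumulative_coverage(bucket_counts, total_words):
    """Return a (running_total, percent_of_corpus) pair after each bucket."""
    rows = []
    running = 0
    for count in bucket_counts:
        running += count
        rows.append((running, 100.0 * running / total_words))
    return rows


def top_n_coverage(form_counts, n, total_words):
    """Percent of the corpus covered by the n most frequent word forms.

    form_counts is a hypothetical {word_form: count} mapping, e.g. one
    built from the spreadsheet once it has been exported as text.
    """
    top = sorted(form_counts.values(), reverse=True)[:n]
    return 100.0 * sum(top) / total_words


if __name__ == "__main__":
    rows = cumulative_coverage(BUCKET_COUNTS, TOTAL_WORDS)
    for i, (running, pct) in enumerate(rows, start=1):
        print(f"top {i * 100:4d} forms: {running:9,d} occurrences, {pct:4.1f}%")
```

The last running total matches the 1,490,842 given in the post, which is about 55% of 2,700,000. With the full list loaded, `top_n_coverage` would give the same kind of figure for any cut-off, not only multiples of 100.
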
>
> When I began my Pali studies, I became a little bit frustrated. I was
> studying lots of words and grammar, but I still had a lot of trouble
> actually reading the Pali texts. After a while, it occurred to me that the
> beginning Pali courses simply did not contain many of the "most common
> words" and "most common word forms".
>
> Obviously, if you study a word you wish to be certain that the word is used
> very frequently in the Pali Canon. By studying the word *forms* you can
> learn:
> a) important vocabulary
> b) important vocabulary exactly as you will see the word in the Pali Canon
> and
> c) the grammar of that word form in a "meaningful, useful and
> easy-to-remember" context.
>
> Personally, I am firmly convinced that this word study list is the "missing
> link" for beginning Pali students. After all, it's not hard to memorize 1000
> word *forms* and that is 55% of the words *as you see them* in the Pali
> Canon.
>
> Keep in mind: many of these words are *also* used in compound words (and the
> base words also occur in less common declensions). The 55% figure does *not*
> include the use of these words in compound words and their use with less
> common declensions!
>
> So that you can get a look at the top 1000 word forms in the word list (and
> so that anyone can use it!), I would like to upload the list to the "Files"
> section as a MS-Works spreadsheet using the "LeedsBit PaliTranslit" font.
> The list has the word form and the count for that word form. Hopefully both
> teachers and students can use this list to "optimize" their Pali course
> work.
>
> I will post it as a spreadsheet so that people can easily sort it by "total
> word form occurrences" or alphabetically. The file is 12K zipped and 30K
> unzipped.
>
> Would you like me to upload the file? How can I do this?
>
> peace from
>
> Andy