Re: PIE-Arabic Correspondences (was Brugmann's Law)

From: Piotr Gasiorowski
Message: 51653
Date: 2008-01-20

On 2008-01-20 21:17, Brian M. Scott wrote:

> Piotr has already pointed out that the 10,000-year time
> limit is a straw man, and that serious historical linguists
> have attempted long-range work. I would add that it is none
> the less clear that evidence of linguistic relationships
> will eventually be swamped by the noise introduced by random
> changes. And on the evidence to date, this noise
> accumulates more than fast enough to make any attempt to
> reconstruct a 'proto-world' language an exercise in
> crackpottery. Even just securely identifying a few odd
> traces of one is highly unlikely: even if such traces still
> exist, odds are that it's impossible to distinguish them
> from false positives.

As a matter of fact, they are likely to exist. In a recent article in
Nature (11 Oct 2007, "Frequency of word-use predicts rates of lexical
evolution throughout IE history"), Pagel, Atkinson and Meade estimate
that "some words [those which are used most frequently -- P.] evolve
slowly enough to allow homologous lexical forms to persist for tens of
thousands of years. These slow rates demonstrate that humans are capable
of producing a culturally transmitted replicator that, perhaps because
of the purifying force of spoken word frequency, can have a
replication accuracy as high as that of some genes. Along with continued
efforts at identifying cognate words separated by thousands of years of
sound change, this raises the possibility of using selected lexical
items to evaluate hypothesized 'long-range' linguistic relationships
such as Eurasiatic and Nostratic."

The data and arguments presented in the article are sound and well worth
a read, though the conclusion is only partly justified, IMHO. What is
stable in the long run is the association of a given high-frequency
etymon with a given meaning, making lexical replacement less likely. But
the "replication accuracy" of frequently used words is in fact _lower_ than
that of rarely used ones -- a point they don't discuss or even notice.
Such words live longer, but may tend to develop irregularly. Suffice it
to compare English <one> (+ its mutant sisters <a, an, 'un> + the
regular development in <only>) with "decent" members of the same OE
lexical set, like <ghost, boat, road, stone, loaf> etc.

The operative phrase in the quotation above is "selected lexical items".
What the authors find is that the frequency of word-use is
cross-linguistically so skewed that just a handful of words account for
most of speech, while all the rest are, on the average, used less
frequently than once per 10000 words. They demonstrate that there is,
accordingly, an enormous variation in rates of replacement even among
the "fundamental" meanings in Swadesh's 200-item list (their estimated
half-lives varying from about 750 to 10000 yrs). So the "selected" ones
are words with meanings like "two", "night", "water" or "die" rather
than "dirty" or "stab" (not to mention non-fundamental vocabulary). It's
therefore hardly surprising that long-rangers often find something
suggestive -- and possibly non-illusory -- like the "mi/ti" pronouns but
little evidence of anything else. The detected affinities may well be
real but insufficient to establish a plausible reconstruction based on
systematic correspondences.
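
To put rough numbers on the half-lives quoted above, here is a minimal
sketch. It assumes that lexical replacement behaves like constant-rate
exponential decay and that two sister lineages lose a word independently
of each other; both are simplifying assumptions made for illustration,
not claims taken from the article. The function names are hypothetical.

```python
def retention(half_life_years: float, elapsed_years: float) -> float:
    """Probability that one lineage still has the original word.

    Assumes exponential decay: half the words with this half-life
    are replaced every `half_life_years` (an illustrative assumption).
    """
    return 0.5 ** (elapsed_years / half_life_years)


def shared_retention(half_life_years: float, elapsed_years: float) -> float:
    """Probability that BOTH of two sister lineages keep the word.

    Assumes the two lineages replace words independently, so the
    single-lineage probabilities simply multiply.
    """
    return retention(half_life_years, elapsed_years) ** 2


# Compare the two extremes of the half-life range cited in the post
# (about 750 and 10000 years) at a time depth of 10000 years.
for half_life in (750, 10000):
    single = retention(half_life, 10000)
    both = shared_retention(half_life, 10000)
    print(f"half-life {half_life:>5} yrs: one lineage {single:.2e}, both {both:.2e}")
```

Under these assumptions a word with a 10000-year half-life survives in
both lineages about one time in four after 10000 years, while a word
with a 750-year half-life has a chance on the order of one in a hundred
million. That gap is why only the "selected" meanings are worth looking
at for deep comparison.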

On the other hand, the findings of Pagel et al. at least suggest a
heuristic for long-rangers: we should initially concentrate on a small
set of high-frequency meanings and try to find a secure foothold there
before attempting further steps. With a little bit of luck we may get
somewhere. But if we cast the net widely among rarer words, we will
almost certainly end up with a useless jumble of fortuitous false
matches, signifying nothing.

Piotr