[Fwd: 14.2034, Disc: Re 'Celtic Found to Have Ancient Roots']

-------- Original Message --------

Subject:	14.2034, Disc: Re 'Celtic Found to Have Ancient Roots'
Date:	Tue, 29 Jul 2003 22:34:12 +0000
From:	LINGUIST List <linguist@...>
Reply-To:	linguist@...
To:	LINGUIST@...

LINGUIST List:  Vol-14-2034. Tue Jul 29 2003. ISSN: 1068-4875.

Moderators: Anthony Aristar, Wayne State U. 
            Helen Dry, Eastern Michigan U. 

Reviews (reviews@...):
	Simin Karimi, U. of Arizona
	Terence Langendoen, U. of Arizona

Home Page:  http://linguistlist.org/

The LINGUIST List is funded by Eastern Michigan University, Wayne
State University, and donations from subscribers and publishers.

Editor for this issue: Karen Milligan 
=================================Directory=================================

1)
Date:  Tue, 29 Jul 2003 15:43:35 +0100
From:  Larry Trask 
Subject:  Disc: Re: 'Celtic Found to Have Ancient Roots'

-------------------------------- Message 1 -------------------------------

On Fri Jul 25 2003 (Linguist 14.2012), Paul Purdom wrote:

> Recently Larry Trask wrote an unfavorable review of a PNAS article on
> the date of Celtic branching (Linguist 14.1876).
>
> I don't want to refute the review point by point.
> However, I do want to take issue with one major point implied by
> the review: the suggestion that people interested in classifying
> languages should ignore the methods used in the article. (This is a
> separate question from whether they should ignore the conclusions of
> the article.)
>
> The problem of classification can largely be divided into:
> 1. what characteristics to consider when doing the classification
> 2. what algorithms to use to convert the data on the characteristics
> into a classification.
>
> The biologists that work on phylogenetic trees are expert on point 2

We'll see about that.

> and they are quite knowledgeable on point 1 when it comes to
> characteristics of organisms.

So, biologists know about organisms.  This is not news.  But is it
relevant?

> There is no particular reason to expect
> them to be expert on point 1 when it comes to languages.

I can't believe my eyes.  Professor Purdom is telling us that
biologists need not have the faintest idea what they are talking about
in order to write successfully about language.

Wrong.

Forster and Toth are not trying to write about organisms.  They are
trying to write about language.  But, as I think I demonstrated in my
commentary, they lack even the most elementary knowledge of language.
And this ignorance leads them into all sorts of blunders, some of them
catastrophic.

If you don't know what you're talking about, then you can't do useful
work.  I don't care how many whiz-bang computational algorithms you
have, or how many successful publications in some other field:
ignorance produces nothing but garbage.

> The review
> mainly focused on the defects related to point 1. I believe that much
> of the merit of the paper, however, relates more to point 2.

It will be interesting to see this "merit" laid out in detail.

> It is interesting to note that at one time most biological
> classification was done using complex characters. This work required a
> lot of human effort, and expert knowledge was needed to develop good
> classifications. Only a few characteristics were used for doing the
> classifications. Much useful work was done with that model.
>
> Much of the recent work on biological classification is based on data
> that is basically very simple: DNA sequences. Although expertise is
> still important, it is much less so. Each piece of data does not have
> much information about how the organisms should be classified.
> However, by using large amounts of data (thousands of positions in the
> DNA sequence) combined with computer algorithms to do the calculations,
> many interesting classifications have been developed.

No doubt.  But so what?  Forster and Toth are not working with
thousands of data points: they are working with a grand total of 28
items.  And they make no use of computer algorithms: they do
everything by hand.

So, because some *other* biologists have done some interesting work
using entirely different methods, it follows that we should take
Forster and Toth seriously?

I note that Professor Purdom explains that biologists have done useful
work with procedures involving "only a few characteristics", but that
he confesses that "expert knowledge" is needed for success in this
case.  Quite so.  But expert knowledge is precisely what is so
conspicuously lacking in Forster and Toth's case.

> Often these have
> agreed with the older classifications, at times they have replaced
> them, and at times they have shown limitations in the new
> methods. People who are interested in classifying how the older
> languages developed into the current ones should pay some attention to
> these techniques.

As Sally Thomason has already explained, we historical linguists are
eager to get our hands on new techniques for investigating linguistic
prehistory, from whatever source.

But--it's not *our* responsibility to wade through Forster and
Toth's paper in order to see if we can locate any jewels.  Rather, it
is the responsibility of the authors to demonstrate that they have
something valuable to offer us.  And they plainly haven't done that.

Forster and Toth are unable to provide an explicit, objective and
principled procedure for assigning states, and therefore for
converting data into trees.  This is not a minor shortcoming, a mere
detail that can be glossed over in order to concentrate on the
imaginary virtues of the method.  It is an unqualified catastrophe.
The authors state explicitly -- not in the article itself, but in the
associated Website -- that our ordinary procedure of relying upon
cognation is *unacceptable* within their method.  According to them,
their method won't work with cognation.  So what criteria do they
offer us instead?

They toss darts at a dartboard, and that's how they get their
assignments.  Or so we must believe, because they refuse to tell us,
and there is no rhyme or reason to their assignments.

So, I can't agree that biologists are expert on point 2, "what
algorithms to use to convert the data on the characteristics into a
classification".  These authors don't even *have* an algorithm.  They
just make it up as they go along.

Don't believe me?  Then look at the article and describe this putative
"algorithm", step by step.

Quite apart from Forster and Toth's questionable manner of choosing
characters, and quite apart from the many errors in their data, the
authors have provided *no procedure at all* for converting raw data
into trees.

Professor Purdom would have us believe that a valuable method lurks
somewhere in the Forster and Toth paper.  But he hasn't identified
this marvel, and I can't see any trace of it.

Larry Trask
COGS
University of Sussex
Brighton BN1 9QH
UK

larryt@...


-- 
Mark Hubey
hubeyh@...
http://www.csam.montclair.edu/~hubey