> >By the way, having jettisoned seven of their 35 characters, the
> >authors announce that they have 29 left. This is a trivial point, of
> >course, but it does nothing to instill confidence in the care and
> >attentiveness of the authors.

H.M. Hubey wrote:
> Throwing out bad data is sanctified, AFAIK, in stats. Let's ask Richard.

The authors appear to be saying that 35 - 7 = 29. I'd sooner believe 6 * 9
= 42.

Throwing out 'bad data' dispels confidence. One's supposed to decide on the
analysis first and only then look at the data, because of the dictum that
'every set of data is peculiar'. Of course, that's far easier said than done.
Then there's the infamous exam question:
there's the infamous exam question:

'N shots are fired at a circular target, and the positions of impacts on
that target are recorded. How does one estimate the parameters of the miss
distribution? (You may assume that the horizontal and vertical
components are independent, normally distributed with zero mean and with
equal variances.) Now, if none of the shots hit the target, there would be
a court martial instead of a statistical analysis. How does this affect the
estimates? '

It's infamous because there is no agreement on the correct answer to the
second part.

It's relevant because misses (arguably 'bad data') do affect the estimation
of the standard deviations.
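A quick simulation makes the point. This is only a sketch: the true sigma,
target radius and shot count below are invented for illustration, and the
'naive' estimator simply ignores the fact that misses were never recorded:

```python
import math
import random

random.seed(1)

SIGMA = 1.0    # true standard deviation (assumed for the simulation)
RADIUS = 1.0   # target radius (assumed)
N = 100_000    # shots fired

hits_x = []
for _ in range(N):
    x = random.gauss(0.0, SIGMA)
    y = random.gauss(0.0, SIGMA)
    if math.hypot(x, y) <= RADIUS:   # only impacts on the target are recorded
        hits_x.append(x)

# Naive estimate: treat the recorded impacts as an unrestricted sample,
# i.e. pretend the misses never happened.
naive_sigma = math.sqrt(sum(x * x for x in hits_x) / len(hits_x))
print(f"true sigma  = {SIGMA}")
print(f"naive sigma = {naive_sigma:.3f}")
```

The naive estimate comes out well below the true sigma, because discarding
the wide shots truncates the sample towards the centre. A defensible
estimate has to model the truncation explicitly, which is where the
disagreement over the second part of the question begins.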
>
>
> >Only one tree is drawn. There is no searching of tree space, and so
> >this is not a "best tree" method.
> >
>
> Why would they do that?

Common practice! Geneticists tend to look for the tree which requires the
fewest changes. The robustness of such trees is, I think, another matter, but
it's a way of producing a defensible cladogram when the branching is not
obvious. As different weightings can produce different results, it doesn't
necessarily help, but if you use a complicated enough program you can apply
Lucifer's GIGO principle (Garbage In, Gospel Out).
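For the curious, the 'fewest changes' count on a fixed tree can be sketched
with Fitch's small-parsimony algorithm; the four-taxon tree and the binary
character states below are hypothetical, purely for illustration:

```python
# Fitch's small-parsimony count: the minimum number of character-state
# changes a *fixed* tree requires for one character.  Searching tree space
# for the best-scoring tree is a separate (and much harder) job.

def fitch(tree, states):
    """tree: nested pairs ending in leaf names; states: leaf -> state."""
    changes = 0

    def walk(node):
        nonlocal changes
        if isinstance(node, str):        # leaf: its state set is a singleton
            return {states[node]}
        left, right = map(walk, node)    # internal node with two children
        if left & right:
            return left & right          # children can agree: no change here
        changes += 1                     # children disagree: one change
        return left | right

    walk(tree)
    return changes

# Hypothetical tree ((A,B),(C,D)) with one binary character:
tree = (("A", "B"), ("C", "D"))
states = {"A": "0", "B": "0", "C": "1", "D": "1"}
print(fitch(tree, states))  # -> 1: a single change on the internal branch
```

On this tree the character needs only one change; regroup the taxa as
((A,C),(B,D)) and the same character costs two, which is exactly the kind
of score a parsimony search uses to rank trees.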

Richard.