----- Original Message -----
> No, Michal. These are cladistic tree (at least they was generated by
> cladistic phylogenic software).
Sorry, but what you posted is in no way a classical cladistic tree. A
cladistic tree cannot be based on arbitrarily selected similarities
between the analysed groups, but exclusively on the position of their
common ancestors. No classical cladistic tree can be formed if the groups
you try to correlate include members that are more closely related
(phylogenetically) to some members of other groups than to members of
their own group, as is obviously the case when you compare selected
geographical regions (these are the groups) containing different Y
chromosome haplotypes (these are the members). I hope you agree with me
that it is not true that every Y chromosome from Europe is more closely
related to every other European Y chromosome than it is to any Y
chromosome from other regions. Let me explain with a very simple example.
Compare the two trees shown below.

         \
          \
          /\
         /  \
        /   /\
       /   /  \
      /   /    \
     /   /      \
  fish birds  mammals

         \
          \
          /\
         /  \
        /   /\
       /   /  \
      /   /    \
     /   /      \
 shark herring human

The first tree looks quite OK, as we all probably agree that fish, birds
and mammals have common ancestors (so they are related). In fact, this is
the most popular way of showing the evolution of vertebrates (of course,
the tree shows only a part of it). However, it cannot be called a
classical cladistic tree, since some fish (or even most of them) are more
closely related to mammals and birds (share more recent ancestors with
them) than to other fish. The second tree is a typical cladistic tree. It
may look a little strange to a non-biologist, but it reflects the
evolution of fish and mammals (or rather of shark, herring and human)
more precisely. In this case we can say that EVERY herring is more
closely related to EVERY mammal than it is to ANY shark.
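The condition described above (a group fits on a cladistic tree only if
its members share an ancestor that has no descendants outside the group)
can be checked mechanically. The following is only a minimal sketch: the
toy tree, its leaf names, and the function names are my own illustrative
assumptions and are not taken from Underhill's data.

```python
# Hypothetical toy tree written as nested tuples (illustrative only).
# Each tuple is an ancestor node; each string is a living group.
TREE = ("shark", ("herring", ("bird", "human")))


def leaves(node):
    """Return the set of leaf names found under a node."""
    if isinstance(node, str):
        return {node}
    result = set()
    for child in node:
        result |= leaves(child)
    return result


def is_monophyletic(tree, group):
    """True if some node of the tree has exactly `group` as its leaves."""
    group = set(group)
    if leaves(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, group) for child in tree)


# "bird + human" share an ancestor with no other descendants.
print(is_monophyletic(TREE, {"bird", "human"}))     # True
# "fish" (shark + herring) does not: their common ancestor is also
# the ancestor of bird and human.
print(is_monophyletic(TREE, {"shark", "herring"}))  # False
```

A regional population could be tested the same way by listing the
haplotypes found in that region as the group.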
A typical cladistic tree is also shown in fig. 1 of Underhill's work, and
that figure should be the basis for reconstructing the evolution of human
Y chromosomes (or the evolution of humans, and maybe their migrations,
too). The trickiest part is of course correlating the evolution of Y
chromosomal patterns with the migrations of human populations in the past
(not to mention correlating it with the languages they spoke). But I'm
still convinced that some estimates are possible. And of course you
cannot do it using a maximum likelihood network such as the one in fig. 2
of Underhill. Let's assume for a moment that the frequencies analysed in
that figure correspond to the Y chromosome haplotypes of all mammals
found in different regions (an even better example would include all
eukaryotic organisms, but unfortunately most of them lack a Y
chromosome). How useful could such a correlation be when trying to
reconstruct the evolution (and/or migrations) of the ancestors of all
horses, mice, or whales living today? But this doesn't mean that by
analysing the Y chromosomal haplotypes of all wild mice we wouldn't be
able to reconstruct quite precisely an evolutionary tree leading to the Y
chromosomes of most species (or strains, if we analyse just one species)
living today (and maybe even roughly assign the nodes to geographical
regions).
> The nodes, branches and the haplogroups represent "defining mutations."
If this is true, what is the "defining mutation" that defines the branch
called "Europe" in fig. 2 of Underhill? And what about the branches
called "Mideast" and "America"? And what is the "defining mutation" for
the whole branch that includes the subbranches "Mideast", "Morocco",
"Basque" and "Europe"?
> It's basic principle of cladistic phylogeny. This is precisely how Ringe
> attempts to tree IE (by "innovations.")
Unfortunately I don't know which exact linguistic features were analysed
by Ringe. You can build a cladistic tree for IE languages based on just
one feature, if you are really confident that the differences you observe
in all the separate branches correspond to a progressive process that
never (or almost never) goes back. (Is there any feature that would meet
this criterion? I suspect it would be hard to reach a consensus on this,
but I don't exclude the possibility.) If you have two such linguistic
features and the trees based on their evolution are not consistent, it
means that at least one of those features does not meet the criterion. Of
course, the more features you analyse, and the more confident you are
that they meet the criterion, the more reliable your final result is.
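The consistency test for two features can be sketched as follows. This
is only an illustration under two assumptions of mine: each innovation
arose once and was never reversed. Under those assumptions, two features
fit on one tree only if the sets of languages showing each innovation are
nested or disjoint. The language labels L1..L4 and the features are
hypothetical and have nothing to do with the features Ringe used.

```python
def compatible(derived_a, derived_b):
    """Check whether two irreversible, single-origin innovations can sit
    on the same tree: the sets of languages that show them must be
    nested (one inside the other) or disjoint."""
    a, b = set(derived_a), set(derived_b)
    return a <= b or b <= a or a.isdisjoint(b)


# Hypothetical languages and innovations (illustrative only).
feature_x = {"L1", "L2"}
feature_y = {"L1", "L2", "L3"}
feature_z = {"L2", "L4"}

print(compatible(feature_x, feature_y))  # True: x is nested inside y
print(compatible(feature_x, feature_z))  # False: partial overlap
```

A False result does not say which of the two features is the unreliable
one, only that at least one of them violates the assumptions.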
However, there is one very important difference between such an analysis
made for languages (and their features) and one made for regional
populations (and their genetic markers). In the case of languages, you
assume that a language is a unit that cannot be a descendant of two
different ancestor languages, because in that situation it would be
impossible to draw a classical cladistic tree. (If it inherits features
from two or more languages, you always choose only the one ancestor that
seems to provide the "core" features, and defining these "core" features
is critical for distinguishing related from unrelated languages.) On the
other hand, when working with arbitrarily selected regional populations
of Y chromosomes, we quite frequently cannot select a "core" haplotype
that would properly define the position of the population as a whole.
Most of the analysed populations (for example the European population)
are too large and heterogeneous to be treated as independent units and
assigned single positions in the cladistic tree.
> (I don't know what you think this tree means, but yes it is a picture of
> the data, correlating the tree in figure one with the geographical
> distributions in figure two. So, yes, your Central Asia theory is rejected
> by the optimal cladistic tree correlating regions and haplotypes generated
> by the data.
Let me know whether you still support this view after reading what I
wrote above.
> And that may surprise you only because this "network is consistent with
> the first two principal components capturing 18% of the variation present
> in the 116 haplotypes." In other words, this is the best view of direction
> of spread that could be generated from the data and it only captures 18%
> of the variances. This low a participation is common in trees with a lot
> of random distribution.)
And this is exactly why this method is useless in this case: the numerous
migrations in different directions cannot be correctly interpreted using
this kind of maximum likelihood network.
> And I've changed my mind. The long Pakistani-Indian branch that off-shoots
> the "Mid East" branch is clearly Sumero-Elamite-Dravidian. No doubt about
> it. Just as these people felt their Y-Chromosomes changing, they definitely
> started a new language family. Probably innovated a more manly, guttural
> set of sounds for use in their more aggressive verbal roots.
You don't need to be sarcastic. I've never claimed that my hypothesis is
anything more than speculation about possible correlations between Y
chromosomal markers, human migrations, and the evolution of languages.
Regards,
Michal