Interpreting the Y-Chromosome Research

From: x99lynx@...
Message: 13581
Date: 2002-04-30

Michal, first of all, I have to say I appreciate the effort you spent on
this. And I hope that you don't take what I say below as being
unappreciative of that effort. The basic problem with Underhill's data is
sample sizes and this affects specific conclusions severely. (E.g., note
that according to Underhill, the M46 mutation that you associate with Uralic
is not even found in Europe.)

The unreasonably large MODERN sample size for Central-Asia-Siberia pretty
much assures that it will show up in many of Underhill's haplotypes. BUT it
seems the researchers corrected for that when they built their "maximum
likelihood network" where Central Asia-Siberia plays a minor role. What is
glaringly absent however is a sampling category called the "Near East," which
might have changed the later distributions significantly.

I'm going to post that map so the apparent implications are clear. My own
conclusion is that this does not tell us much because the research was really
designed to confirm an out-of-Africa hypothesis and is statistically invalid
for more recent conclusions.

Michal writes:
<<The fifth major group that was formed during the "Near East split" is
characterized by the 09 polymorphism. Seems that this group moved toward
Central Asia, where it quickly expanded and split into at least 9 separate
branches.>>

Okay, freeze everything right there.

1. No, there is NO evidence that the "09" group you mention either QUICKLY
expanded OR QUICKLY split. We have NO information about the original
population of this group before it mutated in any way. We have NO
information about how quickly it mutated or any real timeline. We have NO
direct knowledge of where 09 originated or where it went. There is no
mention of a "Near East split."

2. The Underhill phylogenic "tree is rooted with respect to non-human
primate sequences." This means that the timespan of this tree is relative
and not anchored to anything but an estimated rate of mutation based on
out-of-Africa and that rate is not supplied in this study. And we are not
talking about mutation in the entire organism, but in the Y-chromosome, so
the expected rate is even more ephemeral. When the original "09" mutation
occurred is not given, but potentially it very well could have happened 25,000
years ago or more.

3. Michal, it should be apparent that the sampling of this study favors
finding varied haplotypes in Central Asia-Siberia. The sample from
Central Asia-Siberia was 184 males. The sample from Europe was 60. The
sample from the Mid-East was only 24. Since there were 116 haplotypes in
the study, the chance that the study missed prevalent haplotypes
in Europe and especially in the Mid-East is high. That is probably why the
study itself only purports to draw conclusions about the out-of-Africa
hypothesis.

The ONLY reason you see "Central Asia" everywhere in the haplogroup tree is
because the SAMPLES FROM CENTRAL ASIA/SIBERIA MAKE UP ALMOST 20% OF ALL THE
SAMPLES from 21 regions sampled.
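
To put rough numbers on the sampling point, here is a minimal sketch of how
likely a haplotype is to be missed entirely at the three sample sizes quoted
above. It assumes simple random sampling with independent draws, which the
study may not satisfy, and the "true" frequencies are hypothetical values
picked only for illustration. The function name prob_missed is mine, not
anything from Underhill.

```python
def prob_missed(freq: float, sample_size: int) -> float:
    """Chance that a haplotype carried by a fraction `freq` of men
    does not appear at all in a random sample of `sample_size` men
    (assumes independent draws)."""
    return (1.0 - freq) ** sample_size


# Sample sizes as quoted in this post.
samples = {"Central Asia-Siberia": 184, "Europe": 60, "Mid-East": 24}

# Hypothetical true frequencies, for illustration only.
for freq in (0.02, 0.05, 0.10):
    for region, n in samples.items():
        print(f"true freq {freq:.0%}, {region} (n={n}): "
              f"P(missed) = {prob_missed(freq, n):.1%}")
```

Under those assumptions, a haplotype carried by 5% of men would be absent
from a 24-man sample about 29% of the time, and from a 60-man sample about
5% of the time, while a 184-man sample would almost never miss it.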

4. No, it does NOT seem that this original "09" group "moved towards Central
Asia." (In the Underhill study, "Central Asia" excludes "Pakistan +India" and
the "Mid-East", but includes "Siberia.")

In fact, to the extent that particular haplotype survives today (according to
the Underhill table), the largest apparent modern concentration of that gene
combination (marked 87 in Underhill) is in New Guinea (30%), with smaller
ratios in Cambodia/Laos (5%), Hunza (5%) and "Central Asia-Siberia"(3%).
Subsequent mutations of this founder gene do not affect that conclusion.

This means that when we locate the original unmutated "09" gene today, it is
overwhelmingly located in Southeast Asia. If this was a plant or an animal,
standard topographic distribution assumptions would place its origin in
Southeast Asia. (Modern % of population is not the best way to judge ancestor
populations but it is the only one that really allows any conclusions by this
data. Mutations are assumed to happen at the same rate in any location.)

If we step back A SINGLE NODE from your "09" group, to its immediate
predecessor (marked haplotype "71") we find that it survives in a
distribution that spans high concentrations in Morocco, the "Mid-east",
Europe, India-Pakistan, and a smaller ratio in "Central Asia-Siberia."
Despite the fact that "Central Asia-Siberia" has a small modern
representation in haplotype "71", it is still larger than that region's small
ratio of representation of the Haplotype "09 anchor" - haplotype 87. So
Central Asia seems to have lost concentration from the "09" mutation, not
gained it. In effect, 09 originally moved AWAY from Central Asia by this
data.

All this would strongly suggest that the original "09" haplotype was a
peripheral visitor in "Central Asia-Siberia." However, "09" is probably so
old (25,000BC?) that it has little or no bearing on the discussion. It seems
to have directly contributed more to the populations of New Guinea, Cambodia
and Japan than to anywhere else.

The real clue to how old the original "09" haplotype is comes from the fact
that, several nodes down, a mutation in this branch shows a high ratio of
presence in modern Amerind populations. If we associate this with the
migration into America (15,000-8,000 BC?), the number of nodes separating it
from the 09 anchor is actually greater than the nodes separating 09 from an
African genesis. And if you are tracing languages back that far, you've gone
quite beyond the comparative method and even well beyond Nostratic.

Given all that about the 09 node, is there any better evidence of a "Central
Asian-Siberian" source for the much later haplotype 104? (Michal's mutation
M173.)

Haplotype 104's immediate ancestor survives in Haplotype 111. Today, the
rare haplotype 111 apparently finds its main concentration in modern American
Indians, to a much smaller degree in Pakistan-India and to an even smaller
degree in "Central Asia-Siberia." Whatever we make of that, the origin of 111
is as likely to be Pakistan-India (2%) as it is to be Central Asia (1%).
Other haplotypes (e.g., 113) could not have given rise to 104. I would also
question whether a better sampling of the "mid-East" would not have yielded a
truer origin for 111.

A good date for this particular node is well before 8000BC - when migrations
into America may have started. 111 would seem to have originated deep in the
Ice Age - 15,000 BC. 111 is the founder haplotype for 104 and all of
Underhill's Haplogroup XI. Consistently, it seems in these particular
Haplogroups, Pakistan + India seems the core of the new haplotypes, which
makes sense in terms of its probable population density and therefore
opportunity for mutation.

In Underhill, modern Haplotype 104 (mutation M173 alone) has the highest
ratio of concentration among "European" (50%) and "Basque" (60%) populations,
with about 8% in the MidEast, about 6% in India-Pakistan, about 5% in
"Central Asia-Siberia" and Morocco. In normal plant and animal
distributions, this would suggest that the defining mutation occurred in
Europe.
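
How far the smaller percentages can be trusted depends on the sample sizes.
As a rough sketch, the following computes approximate 95% Wilson score
intervals for three of the figures above. It ASSUMES the percentages were
computed on the sample sizes quoted earlier in this post (Europe 60,
Mid-East 24, Central Asia-Siberia 184) and that sampling was random; neither
assumption is confirmed by the study. The function name wilson_interval is
illustrative.

```python
import math
from typing import Tuple


def wilson_interval(p_hat: float, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Approximate Wilson score interval for an observed proportion
    `p_hat` in a sample of size `n` (z = 1.96 gives roughly 95%)."""
    denom = 1.0 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# (region, observed share of haplotype 104, assumed sample size)
observed = [
    ("Europe", 0.50, 60),
    ("Mid-East", 0.08, 24),
    ("Central Asia-Siberia", 0.05, 184),
]

for region, p_hat, n in observed:
    low, high = wilson_interval(p_hat, n)
    print(f"{region}: observed {p_hat:.0%} of n={n}, "
          f"interval roughly {low:.1%} to {high:.1%}")
```

The interval for the 24-man Mid-East sample comes out several times wider
than the one for Central Asia-Siberia, so differences of a few percentage
points between the low-frequency regions carry little weight on this data.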

A later mutation (M17) (haplotype 108) is the only M173-based group that
shows strong eastern presence. It yields a 31% representation in
Pakistan-India and a 16% ratio in Central Asia, again suggesting a
Pakistan-India origin. The modern distribution in Europe according to
Underhill's sampling is 5%. (Compare the numbers you supplied.)

Is it reasonable to connect any of this with language? Back in 25,000BC
maybe. But the difference between M46 and M173 was probably lost the minute
two humans had to talk about how they were going to get something to eat that
day. The fact is that this Y-chrome trait is NOT necessarily correlated to
ANY OTHER physical traits. Even if such an invisible genetic difference once
meant something, it would have been swallowed up by necessity, culture and a
common language.

Regards,
Steve