[tied] Re: pre-Nostratic *male[:]k?xa, 'milk (vb.)'

--- In cybalist@yahoogroups.com, "Richard Wordingham" <richard@...> wrote:

>
> --- In cybalist@yahoogroups.com, "Brian M. Scott" <BMScott@> wrote:
> >
> > At 9:30:27 PM on Sunday, February 12, 2006, mkelkar2003
> > wrote:
> >
> > > --- In cybalist@yahoogroups.com, "Brian M. Scott" <BMScott@> wrote:
> >
> > >> At 6:21:19 PM on Sunday, February 12, 2006, mkelkar2003
> > >> wrote:
> >
> > [...]
> >
> > >>> Of the 35 most basic word vocabulary PIE and OC have 23%
> > >>> cognates. That is too many for a chance occurrence.
> >
> > >> This is a non sequitur. It's also obviously incorrect:
> > >> apart from a tiny handful of possible borrowings, there are
> > >> *no* known cognates, since OC is not known to be related to
> > >> PIE at all.
>
> Let us remember that the chance similarity when applying comparisons
> by Swadesh's rules is about 8%. For 35 meanings, that means an average
> of 2.8 matches could be due to chance! 23% of 35 means 8 items matched.
>
> The probability of getting 8 or more matches out of the 35 is about
> 0.8%, which is impressive. However, that is the *best* of 5
> comparisons. It's not too many for it to be mere chance, but I'm not
> sure how many comparisons were done to get a good score. The problem,

The highest number of matches was for OC and TB, 74% (Table 2), which
means about 26 words match. I do not know if linguists consider these
families to be genetically related. You have raised a good point
about sample size.

"but I'm not sure how many comparisons were done to get a good score"
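
The chance figures quoted above (8% chance similarity per item, 35
meanings, 8 matches, best of 5 comparisons) can be checked with a short
Python sketch. It assumes that each of the 35 meanings matches
independently with the same probability and that the 5 comparisons are
independent of one another; neither assumption is stated in the thread,
and the function names are only illustrative. The Poisson approximation
comes out close to the quoted 0.8%, while the exact binomial tail is
somewhat smaller.

```python
from math import comb, exp, factorial

N_MEANINGS = 35    # size of the basic word list
P_CHANCE = 0.08    # assumed chance-similarity rate per meaning
MATCHES = 8        # 23% of 35, rounded
COMPARISONS = 5    # number of language pairs compared


def binomial_tail(n, p, k_min):
    """P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k)
               for k in range(k_min, n + 1))


def poisson_tail(mean, k_min):
    """P(X >= k_min) for X ~ Poisson(mean), as 1 minus the lower tail."""
    return 1.0 - sum(exp(-mean) * mean**k / factorial(k)
                     for k in range(k_min))


def best_of(p_single, trials):
    """Chance that at least one of `trials` independent comparisons
    reaches a score whose single-comparison probability is p_single."""
    return 1.0 - (1.0 - p_single) ** trials


expected_matches = N_MEANINGS * P_CHANCE
p_binomial = binomial_tail(N_MEANINGS, P_CHANCE, MATCHES)
p_poisson = poisson_tail(expected_matches, MATCHES)

print("expected chance matches:", expected_matches)
print("P(>= 8 matches), binomial:", p_binomial)
print("P(>= 8 matches), Poisson:", p_poisson)
print("P(best of 5 reaches 8):", best_of(p_binomial, COMPARISONS))
```

The last figure is the relevant one if the 23% was picked as the best
of several comparisons: the more pairs are tried, the less surprising
one good score becomes.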

Chi-square tests require enough data points to fill every cell. With
the thousands of words any language has, I do not see that as a problem.
Chi-square is merely a test of association, not causation. For example,
for exploring the relationship between gender and viewing the Super Bowl,
the following table would yield a significantly high chi-square with a
probability of less than 0.05.

                      Male   Female

watch Super Bowl?  Y    80       20
                   N    20       80

which means males are more likely to watch the Super Bowl, but not that
you watch the Super Bowl *because* you are male!

The following table would show no relationship: a chi-square of zero
and a p-value of 1.

                      Male   Female

watch Super Bowl?  Y    50       50
                   N    50       50
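
The two tables can be checked with a small Python sketch of the Pearson
chi-square statistic for a 2x2 table. It assumes no continuity (Yates)
correction and uses the fact that, with 1 degree of freedom, the
p-value equals erfc(sqrt(chi2 / 2)). The function name is illustrative
and not taken from any statistics package.

```python
from math import erfc, sqrt


def chi_square_2x2(table):
    """Pearson chi-square (no Yates correction) and p-value for a
    2x2 table given as [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    total = a + b + c + d
    row_sums = [a + b, c + d]
    col_sums = [a + c, b + d]

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_sums[i] * col_sums[j] / total
            stat += (observed - expected) ** 2 / expected

    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = erfc(sqrt(stat / 2.0))
    return stat, p_value


skewed = [[80, 20], [20, 80]]   # rows: watch Y / N; columns: male / female
even = [[50, 50], [50, 50]]

print("skewed table:", chi_square_2x2(skewed))
print("even table:", chi_square_2x2(even))
```

For the first table every expected cell is 50, so the statistic is
4 * (30^2 / 50) = 72 and the p-value is far below 0.05; for the second
table the statistic is 0 and the p-value is 1.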

Now the fact that Wang (n.d.) has quoted these percentages means that
they are all statistically significant, that is, not due to chance.

m. kelkar

> as my stats lecturer said, is that 'all samples are peculiar'. Theory
> says you form the hypothesis before you look at the data; practice
> says you get the data first. Unfortunately, in historical sciences
> you often cannot gather a fresh data set to test an idea formed by
> looking at a set of data.
>
> > > Refer to Table 2 below:
> >
> > > http://www.ee.cuhk.edu.hk/~wsywang/publications/lg_diversity.pdf
> >
> > >> Presumably you mean that 23 items are superficially
> > >> similar.
> >
> > It would be nice if you'd stop wasting my time: the caption
> > over Table 2 makes it clear that the table refers to items
> > that are superficially similar, not to demonstrable
> > cognates.
>
> That's because there aren't enough possible cognates to go very far
> and confirm the correspondences. The idea is that the 35 meanings are
> those that keep their primary words best. Most of the problems that
> Mark Rosenfelder refers to at www.zompist.com do not apply to this
> comparison. However, the title of 'Apparent Cognates' is immediately
> justified by the method - you could very well be looking at 5 cognates
> and 3 coincidences even if the link between IE and Sino-Tibetan is valid.

>
> However, this is getting off-topic, and should be taken to Nostratic-l
> or even further afield, e.g.
> http://groups.yahoo.com/group/Macro-Familes-L . Prompt, justified
> objections to my mathematics are acceptable, though.
>
> Richard.
>