Re: Proto Vedic Continuity Theory of Bharatiya (Indian) Languages

Dear Dr. Kalyanaraman (are you not a member of this List?),

--- In cybalist@yahoogroups.com, "mkelkar2003" <smykelkar@...>
wrote:

> What is a substrate

Look it up in your linguistics handbooks.

> and what does Nahali become when it has absorbed several layers
> from substrates?

It becomes a typologically New Indo-Aryan language in which
different layers of prehistoric, ancient and medieval substrates are
recognizable (or non-recognizable, as in the case of the so-
called "Proto-Nahali" substrate which some linguists also
call "Proto-Indic").

> Where is this substrate? Can this be identified and isolated for
> all bharatiya languages?

No, the "Proto-Nahali" substrate can be identified for Nahali only.

> Why should Nahali be seen to have absorbed from Marathi? Why not
> vice-versa? Is it not possible that the formation of Marathi
> language had its roots taken from Nahali substrate [...]?

One should carefully examine Nahali word-lists to answer this query,
but I am pretty certain that the Marathi and Hindi vocabulary (as
well as morphological and syntactical structures) in Nahali are
common to other New Indo-Aryan languages, and can be explained
as being derived from Middle Indo-Aryan, in turn derived from Old
Indo-Aryan, in turn derived from Indo-Iranian, in turn derived from
Proto-Indo-European. This should suffice to exclude the possibility
that New Indo-Aryan languages "continued" a prehistoric Nahali
substratum by a process of linguistic accretion and exchange (see
Alinei's "Palaeolithic Continuity Theory"!).

> Why should Colin Masica's language x be restricted to only hindi?
> Why not look upon this language x as the bharatiya 'substrate', the
> proto-vedic?

Masica's "Language X" is not "Proto-Vedic" -- a definition which
means nothing to me: there is only "R.gvedic", a kind of "Old Indo-
Aryan". R.gvedic is the language attested in the R.gveda, the oldest
Indo-Aryan text available to us. If you wish, you can use the
term "pre-R.gvedic" to indicate the reconstructed proto-forms of
R.gvedic lexemes and morphemes that are no longer recognizable as
belonging to (reconstructed) common Indo-Iranian.

"Language X" is defined as a substrate that is found at the bottom
of the agricultural vocabulary of Hindi and some neighbouring New
Indo-Aryan languages spoken in the Gangetic plains. Some 30% of
Hindi agricultural vocabulary is neither Indo-Aryan nor Dravidian
nor Munda, and is, therefore, held by Masica to stem from the
unknown substrate language he labels as "X". The vocabulary
of "Language X" also includes some terms relating to artisans, local
flora and fauna, clothing, household, dancing and music. The only
traces of this substrate in the R.gveda are represented by a handful
of words with the typical "Language X" geminates (see below).
Therefore, "Language X" cannot be equated with "Proto-Vedic".

If Hindi and other Gangetic Indo-Aryan languages were studied the
same way as Kuiper and others have studied the succession of
historical layers in Nahali -- earliest unknown substrate ("Proto-
Nahali"), "Para-Munda", Dravidian, Korku (North Munda), New Indo-
Aryan (Marathi, Hindi dialects) -- one would most likely find in
them similarly stratified layers of Masica's "Language X", "Para-
Munda", Old Indo-Aryan, early Persian and Greek loans, Sanskrit
loanwords, and medieval loans from Arabic, Turkish, Mongolian and
Persian. This linguistic work has not been done to date.

Here is M. Witzel's assessment of the "Language X" substrate:

http://users.primushost.com/~india/ejvs/ejvs0501/ejvs0501c.txt
<< § 2.4. Substrates of the Lower Gangetic Plains and "Language X".

Next to the Mundas, there must have been speakers of other
languages, such as Tibeto-Burmese, who have left us names such as
kosala, kauzikI (mod. kosi), perhaps also kAzi and kauzAmbi (mod.
kosam), from Himalayan khu, ku (Witzel 1993). In IA they also have
left such words as the designations for cooked rice IA *cAmala and
probably also PS zAli 'rice'.

In Uttar Pradesh and North Bihar (attested in Middle and Late Vedic
texts, c. 1200-500 BCE) another apparent substrate appears in which
the 'foreign' words do not have the typical Para-Munda structure,
with the common prefixes, as described above. Masica (1969) called
this unknown substrate "language X". He had traced it in
agricultural terms in Hindi that could not be identified as IA,
Dravidian or Munda (or as late loans from Persian, S.E. Asia, etc.).
Surprisingly some 30% of the terms are of unknown, language "X"
origin, and only 9.5% of the terms are from Drav., something that
does not point to the identity of the Indus people with a Drav.
speaking population.

However, only 5.7% of these terms are directly derived from Munda.
Obviously, the pre-IA population of the Gangetic plains had an
extensive agricultural vocabulary that was taken over into all
subsequent languages. F.B.J. Kuiper has pointed out already in 1955:
137-9 (again in 1991: 1) that many agricultural terms in the RV
neither stem from Drav. nor from Munda but from "an unknown third
language" (cf. Zide & Zide 1973: 15). This stratum should be below
that of Para-Munda which is the active language in the middle and
late Vedic texts.

Again, it has been Kuiper who has pointed the way when he noted that
certain 'foreign' words in the Vedic substrate appear with geminate
consonants and that these are replaced in 'proper' Vedic by two
dissimilar consonants (1991: 67). Examples include: pippala RV
(1.164.20,22; 5.54.12, su- 7.101.5 ) : piSpala AV (in Mss.)
9.9.20,21; 6.109.1,2; su-piSpala MS 1.2.2:11.7, guggulu AV, PS :
gulgulu KS, TS, kakkaTa PS 20.51.6, KSAzv. : katkaTa TS. Kuiper adds
many other cases of Vedic words that can be explained on the basis
of words attested later on.

In RV geminates also occur in 'onomatopoetic' words: akhkhalI-kR 'to
speak haltingly' or 'in syllables?', cf. now Nahali akkal-
(kAyni) '(to cry) loudly in anguish' MT II 17, L 33 (kAyni < Skt.
kathayati 'to tell' CDIAL 2703, cf. 38) MT II 17; cf. also jaJjan-
RV 8.43.8 etc., ciccika 10.146.2 'a bird'?, and cf. also azvattha
1.135.8 : azvatha a personal name, a tree, 6.47.24, with unclear
etymology, (Kuiper 1991: 61, 68).

Post-RV, new are: hikkA PS 4.21.2, kakkaTa PS 20.51.6 (MS kakuTha,
TS katkaTa!), KSAzv in YV: kikkiTA KS, TS, kukkuTa VS, pilippilA TS
7.4.18.1, cf. also TS Akkhidant, prakkhidant TS 4.5.9.2, Ajjya
5.2.7.3. Especially interesting is the early gemination *dr >
ll: kSullaka AV 2.32.5, TS 2.3.9.3 kSullaka, < kSudra 'small' (a
children's word?); later on, among others, bhalla-akSa ChU4.1.2,
bhalla Br., MBh (with variants phala, phalla! EWA s.v.); JB malla 'a
tribe' (in the Indian desert, Rajasthan; cf. DEDR 4730), etc.

Though certain geminates, especially in word formation and flexion (-
tt-, -dd-, -nn- etc.), are allowed and common, they hardly ever
appear in the stem of a word (Sandhi cases such as anna, sanna etc.
of course excepted). Until the late BrAhmaNa texts, other geminates,
especially bb, dd, gg, jj, mm, ll, but also kk, pp, etc., are
studiously avoided, except in the few loan words mentioned above
(pippala, gulgulu, katkaTa etc.; Kuiper 1991: 67 sqq.).

It will be readily seen that Kuiper's seminal observation reflects a
tendency that can be observed throughout the Vedic texts. Geminates,
especially the mediae, apparently were regarded, with the exception
of a few inherited forms such as majj 'to dive under', as 'foreign'
or 'barbaric'. They did not agree with the contemporary Vedic (and
even my own) feeling of correct speech (Sprachgefühl).

However, starting with Epic Sanskrit, forms such as galla, malla,
palla, etc. are normal and very common (however, -mm-, perhaps
regarded as Drav.(?) remains rare); such words, in part derive from
normal MIA developments, in part from the substrate.

This tendency can be sustained by materials from various other
sources. In the language 'X' only a few of Masica's agricultural
substrate words that do not have a clear etymology (1969: 135)
contain such geminates: Hindi kaith < Skt. kapittha CDIAL 2749
(Mbh), piplI/pIplA < pippala (RV), roTI < *roTTA, roTika 10837
(Bhpr.); karela < karella/karavella 3061, khAl < khalla 3838-9
(Suzr.); to these one can add the unattested, reconstructed OIA
forms (Turner, CDIAL, see Masica 1969: 136): *alla CDIAL 725,
*uDidda 1693, *carassa 4688, *chAcchi 5012, *bAjjara (see, however,
OIA *bAjara, 9201 bAjjara HZS: varjarI!), *balilla 9175, *maTTara
9724, *suppAra 13482, *sUjji/sOjji 13552. However, these words have
come into NIA via MIA, and their geminates may go back to a
consonant cluster without geminates (see below, on Turner's
reconstructs).

All of these tendencies are reconfirmed by what we can discern in
the other substrate languages. While there still are but a few cases
in the northwest, the substrates located further east and south all
have such geminates. (Incidentally, the northwest has retained the
original, non-geminate consonant groups, such as -Cr-, to this day,
cf. Khowar bhrar, Balkan Gipsy phral 'brother', W. Panj. bhrA, E.
Panj. bh(a)rA : Hindi bhAI, etc.).

In the unstudied substrate of the Kathmandu Valley (inscriptions,
467-750 CE, see below), geminates are found in the following place
names: gamme, gullataMga, gollaM, jajje-, dommAna, daGkhuTTA-,
bemmA, cf. also bhumbhukkikA (onomat. with double consonant: <
*bhumbhum-ki-kA?); cf. also village names such as joJjon-diG, tuJ-
catcatu, thuMtuM-rI, daNDaG-(guM).

In the substrate of modern Tharu: e.g. ge~TTI, ghaTTI, TippA (?),
ubbA; cf. also 'onomatopoetic' words such as jhemjhemiyA 'small
cymbal or drum', bhubhui 'white scurf', gula-gula 'mild' (with the
usual middle Vedic, OIA, Tamil, etc. form of the "expressive" and
onomatopoetic words: type kara-kara versus older Vedic bal-bal).

In modern Nahali (Kuiper 1962: 58 sqq., 1966) the following
substrate words can be found, though apparently various types of
consonant groups are allowed: bekki, beTTo, bokko, coggom, cuTTi,
joppo/jappo, kaggo, kAllen, maikko, oTTi, poyye, unni. Additions to
this list can easily be supplied now from that of A. Mundlay (MT II)
which are not obviously from NIA include 8 aDDo, 91 attu', 182
bekki, 203 beTTo, 221 bijjok, 232 biTThAwi, 255 buddi, etc.

In the Drav. Nilgiri languages (Zvelebil 1990:63-72) there are a few
isolated geminating words that go back to a pre-Drav. substrate,
e.g. Irula mattu 'lip', Dekkada 'panther', muTT(u)ri 'butterfly',
vutta 'crossbar in a house'.

The Vedda substrate contains the same type of words: cappi 'bird',
potti 'a kind of bee', panni 'worm' (de Silva 1972: 16).

Finally by way of appendix, in the isolated Andamanese language (Aka
BIada dialect), a few consonant groups seem to be allowed, but
hardly any geminates are found (Portman 1887): dAkkar-da 'bucket'
p.18, kAttada, badda 'crab' 22, chetta-da 'fruit' 34, tokko dElE
kE 'to go along the coast', chetta-da 'head' 36, sissnga kE 'to
hiss' 38, udda 'maimed' 48, peggi 'many' 48, teggi lik
dainga 'noise' 52, teggi lik dainga kE 'to obey' 54, molla-
da 'smoke' 72, tekke yAbadO 'straight' 78.

It can be stated, therefore, that the substrate languages outside of
the extreme northwest indicate broad evidence for original
geminates. Differently from IA (cf. below, on Turner's
reconstructions), these words have not been pushed through
the 'filter' of MIA, that means their original consonant clusters
have not been 'simplified' (e.g. kt > tt, kS > kkh, etc.) Such
striving for simpler syllable structure is known from many
languages, e.g. Latin noctem > Italian notte, French nuit [nu"i], or
O.Tib. bgryad > Tib. [y] 'eight', Jpn.-Austro-Thai *krumay > Jpn.
kome 'rice' (Benedict), Kathmandu Valley substrate kicipriciG(-
grAma) > Newari kisipi~Di, etc. Even then, the tendency seems
especially strong in S. Asia and probably has worked on IA from the
beginning, as for example in the early example AV kSullaka <
kSudraka. In Drav. various consonant groups are allowed, including
geminates (Zvelebil 1990: 10 sqq.), e.g., kakku, kaccu, kaTTu,
kattu, kappu, kammu; (cf. also the interchange p- :: -pp-/-v- :: -p/-
u).

One can therefore put the question whether this old substrate
tendency has already influenced the Para-Munda of the RV. In Munda
itself, such geminates are very rare (cf. Kuiper 1991: 53), and open
syllables are common. However, there is a tendency in the Munda
languages to eliminate consonant groups caused by vowel loss in
prefixes (Pinnow 1959: 457); this does not cause geminates in such
cases but is in line with the similar developments from Old to
Middle and New IA (e.g. akSi 'eye' > akkhi > A~kh, rakta 'colored,
red' > ratta > rAt, etc.). One may therefore explain many of
the 'foreign' words with geminates in Vedic and post-Vedic,
excluding Drav. loans, in the same way.

For the same area that is covered by Masica's language "X", and for
N. India in general, one may also adduce the many words in NIA that
are not attested in Vedic, Classical Skt. or the various MIA
languages such as Pali but that occur only in their NIA form. They
have been collected and reconstructed by V. Turner in his CDIAL.
These include the starred forms, appearing in their reconstructed
OIA form, and those words that do not appear in Ved. but are more or
less accidentally attested in late Skt. texts, and the substrate
words dealt with by Turner. They have a typical, often non-IA
structure, including the very common cluster -ND-, -TT-. Their root
structure follows the following pattern. (C = any consonant, @ any
vowel)

*C@..., C@..., C@..., C@..., C@..., C@..., C@..., C@..., C@..., C@..., C@...,
C@..., C@..., C@..., C@..., C@..., C@..., C@..., C@..., C@..., C@..., C@..., C@...,
C@..., C@..., C@..., C@..., C@...

In Turner's CDIAL there are only a few forms such as *Cr@..., Cr@...,
Cr@..., Cr@..., Cl@...; this does not surprise as all reconstructed
words have passed through the filter of MIA and have lost such
clusters, -- except in the extreme northwest (Lahnda and Dardic).

Double consonants at the end of roots may go back to complicated
clusters that can no longer be reconstructed, for example *C@... <
**C@... (cf. RV kSviGkA, ikSvAku, and compare Ved. clusters such as
matkuNa, matkOTaka, kruJc). Consonant clusters with various
realizations in pronunciation may also be hidden in many Vedic loan
words (Kuiper 1991 : 51 sqq., Ved. cases p. 67 sqq.). >>

Kindest regards,
Francesco Brighenti