Morphology (1/20)

I will present here (in approximately 20 chapters) my current views on (Pre-)PIE
morphology. Comments are, of course, welcome.

================================================================================

PIE Morphology

In the following, the emphasis will be on trying to recover the agglutinative
patterns which lie hidden behind the complex flexional morphology of
Proto-Indo-European as it can be reconstructed from the attested daughter
languages.

I. THE VERB

PIE verbal morphology distinguished three "voices": active, middle and stative.
There is evidence for three moods: indicative, conjunctive (= subjunctive) and
imperative, as well as for some other emerging modal categories (optative,
causative, desiderative). Only two tenses or aspects were initially
distinguished (present/progressive vs. preterite/punctual), but in most daughter
languages new temporal and aspectual categories (aorist, perfect, imperfect,
future, iterative etc.) were created using different means. There were thematic
and athematic conjugations, but the thematic indicative was not always formally
distinguished from the conjunctive. In the finite forms, three persons and three
numbers (singular, dual and plural) seem to have been distinguished. The
non-finite forms include verbal nouns (infinitives, etc.) and adjectives
(participles).

The verbal stem is composed of the root (simple or reduplicated), followed by
optional suffixes (modal, thematic). The endings express person, number, tense
and "voice" of the finite verb. A number of languages show a prefixed "augment"
*h1e-, marking past tense.

As in the noun, we can distinguish between different accentual patterns.
Thematic verbs have fixed accent (either on the root or on the thematic vowel,
as in the Sanskrit <tudáti> type). Athematic verbs usually accentuate the root
in the strong forms (those with asyllabic endings: the active/stative singular,
the 3sg. imp.), but the endings in the weak forms (those with syllabic endings:
the active/stative dual/plural and all the rest). If there is a suffix, it
usually takes the accent instead of the root (as in the hysterodynamic nominal
type). To my knowledge, there is no clear evidence for a verbal proterodynamic
type (which would have had full grade of the root and zero grade of the suffix
in the singular, zero grade of the root, full grade of the suffix and presumably
o-grade of the endings in the plural etc.). A small number of verbs ("Narten
presents") accentuate the root throughout. Probably they had a long vowel (*e:
or *o) in the strong forms, which was shortened in the weak forms but attracted
the accent back to the root (by Rasmussen's initial accent rule). However, the
endings of these static verbs (where we would expect zero grade or o-grade in
the ending) do not seem to differ from those of other verbs, except in the
reduction of 3pl. *-ent to *-n.t (which also happens in all reduplicated verbal
forms). The stative, if different in origin from the "Narten presents" at all,
had a similar structure (as in the Hitt. hi-conjugation with Ablaut sg. *o ~
pl. *e, perhaps from **a: ~ **a).

1. The present athematic

The athematic present was formed by affixing personal pronouns (**-mu, **-tu)
and number markers (**-an, **-ik) to the verbal stem. The third person was
initially left unmarked, but later *-t(V) was added (presumably a form akin to
the demonstrative pronoun *to-). That this was a later addition follows from
the fact that in the plural it comes _after_ the number marker *-an(a) (> 3pl.
*-én-t-). Subsequently, final vowels were lost, and the resulting 2sg. final
*-tw developed into some kind of sibilant, marked here as *-c(w) (> *-s(w)).
Finally, a suffix *-i was added as a "present tense" marker. Typologically, it
is highly unlikely (although not impossible: this _is_ what is attested in e.g.
Hittite) that an unmarked past should contrast with a marked present. In origin, the
*-i, identical to the locative suffix, may therefore have been used to mark the
progressive (cf. the Welsh progressive <yr wyf i yn darllen> = "I am
a-reading"). In time, this became the normal present tense. The typological
abnormality was remedied by a tendency to partially lose the distinction between
endings marked with *-i ("primary endings") and not so marked ("secondary
endings"), e.g. in Celtic and Italic, and the creation of new, more marked, ways
of expressing the past tense (e.g. the augment, the s-aorist).

I would reconstruct the most remote stage of the active endings as follows:

    sg.       pl.            du.
    **'-mu    **-mu-án(a)    **-mu-ík(a)
    **'-tu    **-tu-án(a)    **-tu-ík(a)
    **'-0     **-0-án(a)     **-0-ík(a)

In the singular, the accent was on the verbal stem (normally giving e-grade);
elsewhere the accent was on the endings, later resulting in zero grade.
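
The agglutinative pattern of this first stage can be sketched in a few lines of Python. This is only an illustration of the table above, not a claim about how the forms should be derived: the names PERSONS, NUMBERS and stage1_ending are hypothetical, "0" stands for the unmarked third person, and the leading ' marks accent on the stem, as in the table.

```python
# Illustrative sketch only: person marker + number marker, as in the
# first-stage table. All names here are hypothetical helpers.
PERSONS = {1: "mu", 2: "tu", 3: "0"}                   # "0" = unmarked 3rd person
NUMBERS = {"sg": None, "pl": "án(a)", "du": "ík(a)"}   # accent sits on the number marker


def stage1_ending(person: int, number: str) -> str:
    """Return the reconstructed first-stage ending in the post's notation."""
    pronoun = PERSONS[person]
    marker = NUMBERS[number]
    if marker is None:
        # Singular: no number marker, accent on the stem (written ' before the hyphen).
        return f"**'-{pronoun}"
    # Plural and dual: accented number marker follows the person marker.
    return f"**-{pronoun}-{marker}"


def stage1_paradigm() -> dict:
    """Build the full table: one row per person, columns sg., pl., du."""
    return {p: [stage1_ending(p, n) for n in ("sg", "pl", "du")] for p in PERSONS}


if __name__ == "__main__":
    for person, row in stage1_paradigm().items():
        print(person, *row)
```

The later stages are not pure concatenation (they involve the Auslaut developments discussed below), so the sketch deliberately stops at the first stage.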

Second stage, after loss of final vowels and various Auslaut developments:
    sg.       pl.         du.
    **-mw     **-mwén     **-mwáh2
    **-cw     **-twén     **-twáh2
    **-t      **-én-t     **-yéh2-t

Third stage, after addition of *-i:
    sg.         pl.           du.
    **-mwi      **-mwéni      **-mwáh2i
    **-s(w)i    **-t(w)éni    **-t(w)áh2i
    **-ti       **-énti       **-yéh2ti

This comes close to the Anatolian paradigm. In Hittite, we have (for the verb
es- "to be"):
         sg.       pl.
    1    es-mi     as-weni, as-wani
    2    es-si     as-teni, as-tani
    3    es-zi     as-anzi

The third person endings -zi and -anzi (<z> to be read as /ts/) are affected in
Hittite by the recent palatalization of *t before i (Luwian has -ti and -nti).
Comparison with other PIE languages shows that the 3rd person plural must be
reconstructed as *-énti (the a-vocalism in -anzi is perhaps an intrusion from
thematic *-onti). The labialization of the first person marker *mw is best
demonstrated by comparing Hittite -mi, -weni with the Luwian 1st person sg. -wi
and 1.pl. *-mani (implied by attested 1pl. preterite -man). Similarly, the rest
of IE shows *-m- in the sg. and pl., *-w- in the dual. The labialization in the
second person has left fewer traces, besides of course the assibilation of
final -tw to -s(w). Perhaps it is seen in the 2nd person plural, where /e/ may
have been rounded to /o/ in the variant form -tani besides -teni, as in the case
of the 1st person plural form -wani, for which we can also compare Italic,
Celtic and Slavic 1pl. *-mos (besides elsewhere *-mes).

The model outlined above explains the singular and 3rd person plural forms, but
the other plural forms and the whole of the dual followed for the most part a
different path of development outside of Anatolian (which in any case offers no
evidence for the dual).

It is evident that *-én, which does not occur in PIE as a plural marker on nouns
or pronouns, was no longer recognized as a plural marker early on, as witnessed
by the fact that *-t was added after it in the 3rd person plural. The regular
development of final *-n into *-r (except after *m, as in the 1st person plural)
led to further loss of clarity in the past tense forms (those without *-i),
leading to an uncomfortable contrast between present tense *-mwéni, *-t(w)éni,
*-énti versus past *-mwén, *-tér and *-ér. An alternative set of forms using
the more usual (pro-)nominal plural element *-és(w) as a present tense marker
(for the past, perhaps by analogy with the perfect, *-é was used), came to be
used both in the plural and the dual:

Present endings, "Set II":
    sg.         pl.           du.
    [**-mwi]    **-mwésw      **-mwh2ésw    > *-wás
    [**-swi]    **-t(w)ésw    **-t(w)h2ésw  > *-thás
    [**-ti]     [**-énti]     **-h2tésw     > *-tés

This explains most of the non-Anatolian forms (see below), except perhaps for
the curious phenomenon that a number of languages (e.g. Greek) maintained an
asymmetry between the first and second persons plural, with *-mes/*-men for the
former, and *-te (apparently the secondary ending) for the latter. This can
perhaps be explained as influence from the 2pl. conjunctive (which has secondary
endings).
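
The final *-n > *-r development invoked above for the past plural forms can be illustrated with a small Python sketch. It rests on two labelled assumptions: the input pre-forms *-tén and *-én are simplified placeholders rather than forms given in the text, and "except after *m" is operationalized crudely as "the ending contains m". The function name final_n_to_r is hypothetical.

```python
# Illustrative sketch only. Assumptions: simplified pre-forms, and a crude
# test ("contains m") standing in for the condition "except after *m".
def final_n_to_r(ending: str) -> str:
    """Turn word-final -n into -r unless the ending contains m."""
    if ending.endswith("n") and "m" not in ending:
        return ending[:-1] + "r"
    return ending


# Hypothetical past (secondary) pre-forms, i.e. without the *-i.
past_before = ["-mwén", "-tén", "-én"]
# Present (primary) forms as cited in the text; the nasal is not final here.
present = ["-mwéni", "-t(w)éni", "-énti"]

past_after = [final_n_to_r(e) for e in past_before]
present_after = [final_n_to_r(e) for e in present]

if __name__ == "__main__":
    print("past:   ", past_after)
    print("present:", present_after)
```

Under these assumptions the present forms pass through unchanged while the past forms come out as -mwén, -tér, -ér, which is the contrast described above.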

The Hittite forms were given above. We will next review the attested forms in
the other languages in the light of the model outlined above.

In Tocharian, the present athematic endings are best preserved in Tocharian A
(Tocharian B has generalized the past endings). A sample paradigm (läka:- "to
see"):

    sg.        pl.
    lka:-m     lka:-mäs
    lka:-t     lka:-c
    lka:-s.    lke-ñc

The most striking aspect about the Toch. A paradigm is 2nd sg. -t. It is
unclear whether this is an intrusion from the stative 2 sg. in *-th2(a), or a
substitution of the ending by an agglutinated 2 sg. pronoun *tu (something which
is most likely to happen in the second person, as we can see for instance in
Germanic 2sg. -st, Welsh 2pl. -ch, or Catalan 2sg. ets < es-t "you are"). The
3rd sg. endings (Toch. A -s., Toch. B -m.) were originally particles (perhaps -m.
< *nu, -s. < *se), added after the reduction to zero of the original 3rd person
ending (whether this was active *-t or stative *-e).

The 3rd pl. ending -(i)ñc is regular from *-énti. The 1st and 2nd plural
endings -mäs and -c come from *-mesi and *-te[s], respectively. The form *-mesi
(compounded from *-mes from "Set II" with the *-i from "Set I"), has an exact
counterpart in Indo-Iranian *-masi.

Germanic has preserved only a few traces of the athematic present. The Gothic
paradigm of "to be" is:

         sg.     du.        pl.
    1    im      siju       sijum
    2    is      *sijuts    sijuþ
    3    ist                sind

The endings are unproblematical, and can be derived from something like *-mi,
*-si, *-ti, pl. *-me[n], *-te[n], *-nti, du. *-wa. The 1st and 2nd person
plural cannot reflect the "Set II" forms *-mes, *-tes (which would have given
Gothic *-ms, *-ts), but the dual forms (1du. thematic -os < *-o-wVs, 2du. -ts <
*-tVs) do reflect the expected "Set II" forms (*-wás, *-tás).

In Armenian, the athematic paradigm of "to be" has become closely intertwined
with the thematic conjugation. We have:

    "to be"                  thematic presents:
    em   < *es-mi            -em
    es   < *es-si            -es
    ey   < [*es-e-ti]        -ey
    emk` < *es-mVs           -emk`
    eyk` < [*es-e-tes]       -eyk`
    en   < *es-enti          -en

The 3rd sg. and 2nd pl. endings -y, -yk` are from the thematic conjugation,
while the endings of the other persons have been carried over from the verb "to
be" to the thematic verbs. The 1st and 2nd plural endings *-mes and *-tes have
final -k` in Armenian, which identifies them as plural /s/'es (other kinds of
*-s give Armenian zero or -r).

The Old Irish verbal system is characterized by an opposition between conjunct
and absolute conjugations. The absolute is used when the verb stands alone at
the head of a sentence (VSO is the default word order in [Insular] Celtic),
while the enclitic conjunct form is used when a negation, preverb, or other
particle opens the sentence. It is clear that the absolute forms are based on
the addition to the normal, conjunct, form of an element *es or *is (I would
guess the copula *est, which in Old Irish was still used to mark a part of
speech that had been pulled to the front of the sentence):

    a-stems                      i-stems
    conjunct:     absolute:      conjunct:     absolute:
    -(a)im(m)     ,,             -im(m)        ,,
    -(a)i         ,,             -i            ,,
    -a            -(a)id         -i            -id
    -am           -m(a)i         -em           -immi
    -(a)id        -th(a)e        -id           -the
    -at           -(a)it         -et           -it

For the conjunct forms, straightforward reconstruction gives 1st and 2nd sg.
*-mi and *-si while the endings of the third person appear to have lost final
*-i. So:

    *-a:-mi      > *-am^   > -(a)im
    *-a:-si      > *-ai    > -(a)i
    *-a:-t       > *-a     > -a
    *-a:-mo[s]   > *-am    > -am
    *-a:-te[s]   > *-at^   > -aith, -aid
    *-a:-nt      > *-add   > -at

The gemination in such by-forms as -(a)imm, -imm derives from the paradigm of
the verb "to be", where *esmi gives amm.

In the absolute forms there is evidence for a long vowel (resulting from
contraction of the final vowel of the ending with the added element *IS/*ES) in
the 2sg., 1pl. and 2pl., but not in the 1sg.:

    -a:-m   + ES  > -a:m^is^   > -a:m^    = -(a)im
    -a:-si  + ES  > -a:i:s^    > -a:i(:)  = -(a)i
    -a:-t   + ES  > -a:t^is^   > -a:t^    = -(a)ith
    -a:-mos + ES  > -a:mois^   > -a:moi   > -m(a)i
    -a:-te  + ES  > -a:t^e:s^  > -a:t^e:  > -the
    -a:-nt  + ES  > -a:dd^is^  > -a:dd^   = -ait

Latin, and Italic in general, share with Celtic the tendency to lose the
opposition between primary endings (with *-i) and secondary endings. In the 3rd
person, a distinction is still attested in Old Latin between present tense -t
(from *-ti) and past tense -d (from *-t). The verb "to be" is conjugated in
Latin as:

    s-um  < *s-m.        s-umus < *s-m.os
    es    < *es-s        es-tis < *es-tes
    es-t  < *es-t        s-unt  < *s-n.t

This provides some evidence for the use of this verb as an enclitic (1sg, 3pl.).
The 1st and 2nd persons singular use the "secondary" forms (without *-i).

In Slavic, we have the following forms in the present tense of the verb "to be":

    sg       du       pl
    esmI     esvê     esmU
    esi      esta     este
    estU     este     so~tI

The 1st and 2nd plural forms point to *-mos and *-tes (or *-te). The dual
endings are secondary endings, except that the 1st person du. has -vê for
expected *-va under the influence of the pronoun vê "we two". The really
problematical form is 2sg. -si for expected *-sI. Slavic -i may come from PIE
*-i:, *-ei, *-e:i, *-je:, *-joi, *-jai, or, in some cases, *-oi or *-ai. On
internal Slavic evidence a connection with 2sg. optative/imperative *-yeh1-s
might seem attractive, given that aberrant forms of the second person singular
are often to be explained as intrusions from modal forms [or from the plural, or
from pronoun agglutination]. The Baltic evidence, however, unmistakably points
to *-e(:)i as the common source.

The Lithuanian athematic endings can be reconstructed (on the basis of the
reflexive verb which preserves long vowels and diphthongs that are otherwise lost
in final syllables) as:

    sg.      pl.      du.
    *-mi     *-me:    *-va:
    *-sei    *-te:    *-ta:
    *-ti     --       --

Baltic has lost the 3rd person du. and pl., for which the sg. is used. As in
Slavic, the dual endings are secondary endings. Interesting is the paradigm of
"to be" in Old Prussian:

    sg.      pl.
    asmai    asmai
    asei     astê
    asti     --

The first person forms are enigmatic. One theory holds that 1st person sg.
asmai stands for /asma:/ and really represents the same form as Latv.,
Lith. dial. esmu, a cross between athematic *es-mi and thematic *es-o:. If so,
the same reasoning can be applied to the 1pl. form, which would represent *-mo:,
a rounded variant of Lith. -me:. Comparison between Baltic *-me: ~ *-mo:, *-te:
and Slavic *-mos, *-tes would then seem to indicate that Baltic, for reasons
unknown (this is certainly not a regular phonetic development in Baltic),
replaced *-s(w) in these plural forms with vowel lengthening. There is no
connection with the OHG 1pl. ending -me:s, which is likely to represent an
agglutinated pronoun (-m(e) we:s > -me:s), presumably in order to obtain a
better distinction between 1sg. and 1 pl. in the athematic verbs: OHG salbo:-m ~
salbo:-me:s.

Returning to the enigmatic Balto-Slavic 2sg. athematic ending *-sei, the only
explanation I can give is that this is indeed an intrusion from a conjunctive
form, namely the stative conjunctive (more on which later), which had 2sg. *-é:,
giving *-e:i (> BS *-ei) when extended with primary *-i. If the -i in the Old
Prussian spelling -mai represents a phonetic reality, the 1sg. and 1pl. must
also have added this -i to original -o: (or perhaps to stative conj. 1sg. -a:)
and 1pl. -mo:.

Albanian has a peculiar ending -ni in the 2 pl. It is tempting to see in it a
continuation of Hittite -teni, but given the existence of -të or -ët in
non-present forms (pointing to plain extra-Anatolian *-te), an origin in
agglutinated *-nu: "now" (Rasmussen) is probably preferable.

Greek has the following present paradigm of the athematic verb "to be":

    sg.                 du.       pl.
    ei-mí               --        es-mén ~ es-més
    es-sí ~ esí > eî    es-tón    es-té
    es-tí               es-tón    *es-enti > eisí

Interesting is the presence in Greek of both -mes (Western) and -men (Eastern)
in the 1st person plural. The dual forms are originally middle dual forms
(which I reconstruct as 2/3 du. subject *-(h2)t(w)om, *-(h2)tom, more on which
later).

The Gathic and Vedic paradigms:

    Gathic:
    sg.      du.                 pl.
    ah-mi    *vahi < *h-vahi     mahi < *h-mahi
    a-hi     --                  s-ta
    as-ti    *s-to: < *s-tas     h-anti

    Vedic:
    sg.      du.                 pl.
    ás-mi    s-vás               s-más(i)
    á-si     s-thás              s-thá(na)
    ás-ti    s-tás               s-ánti

Besides the mixed 1pl. form *-mási (*-méni x *-més), the only problematic form
is 2pl. -thá, with variant -thána, for expected *-tás. Perhaps the identity
with 3du. -tás prompted the replacement of the 2pl. form (although one would
sooner expect the dual to yield). Perhaps it's simply the same avoidance of
*-tés and adoption of secondary *-té that we see in Greek. If so, the need to
consistently distinguish primary and secondary forms led to the replacement of
*-té in the present by *-th2á, originally a secondary dual form.

As to -thana (also -tana in the past tense), it is hard, despite the different
vocalism, not to think of a connection with the "Set I" ending *-téni. Perhaps we
are dealing with an (object?) pronoun -a (i.e. the anaphoric pronoun *-e ~ i),
which would already have been in close attachment to the verb at a time before
the sound law -n > -r began to work. If so, the link between present tense
-tani+a and past tense -tan+a would have remained unaffected by the sound law in
this form (as opposed to unextended *-tani ~ *-tar), leading to the preservation
of -tana in Indic (with its present tense counterpart *-tania later remodelled
analogically to -thana).

=======================
Miguel Carrasquer Vidal
mcv@...