Diffusion

Linguistic areas

In many parts of the world, extended contact among speakers of different languages has led to the development of similar structures independent of genetic relationship. The languages may be related to one another, but as a group those in a linguistic area (or Sprachbund "language union") show special properties not attributable to common inheritance. This can therefore be an issue in determining whether languages are truly related.

A famous example is the Balkan linguistic area, mainly including such languages as Bulgarian, Macedonian, Albanian, and Romanian as well as (less centrally) Greek, Serbo-Croatian, and Romani (Gypsy). Most of these languages are from distinct branches of Indo-European, and their relatives outside this area (especially the other Slavic and Romance languages) fail to show the Balkan characteristics.

We'll look at a few examples of Balkan linguistic characteristics (Balkanisms), especially in syntax, before examining similar phenomena in the Americas.

See a more complex map of ethnic groups (often defined by language) in the Balkans, found at this site.

Periphrastic future

Many languages have a morphological future expressed using a suffix, in a way similar to the past tense, as in these Latin examples.

"I prepared"	par-āv-ī
"I will prepare"	par-āb-ō

In the Balkans, however, we find a periphrastic pattern, where a separate word or other element combines with a verb to yield this meaning. Just as with English will, the element in most of these languages derives from a verb meaning "want" (the Gheg form with kam "have" is an exception).

Language	Formation	"I will see"
Albanian (Tosk dialect)	do (invariant) + subjunctive	Do të shikoj
Albanian (Gheg dialect)	kam (conjugated) + me + verbal noun	Kam me shikue
Greek	θa (invariant) + subjunctive	θa ðo / vlépo
Bulgarian	šte (invariant) + present tense	Šte vidja
Macedonian	kje (invariant) + present tense	Kje vidam
Serbo-Croatian (literary)	xteti (conjugated) + infinitive	ja ću videti
Serbo-Croatian (colloquial)	xteti (conjugated) + subjunctive	ja ću da vidim
Romanian (standard)	voi (conjugated) + infinitive	Voi vedea
Romanian (colloquial)	o (invariant) + subjunctive	O să văd
Romani (Erli)	ka (invariant) + subjunctive	Ka dikhav

The two Greek forms are "I will see / be seeing".

Avoidance of infinitives

It is common in many languages to use an infinitive or bare form of a verb in subordinate contexts, although the details may vary (e.g. French doesn't use the infinitive when the subject differs, but does use it with more verbs than English does, as in Je crois avoir fini "I believe (that I) have finished").

I write		J'écris
I want to write	*I want that I write	Je veux écrire	*Je veux que j'écrive
I want you to write	*I want that you write	*Je te veux écrire	Je veux que tu écrives

In the Balkans, however, we find the forms without the infinitive even when the two subjects are the same (the equi-subject case), as in "I want that I write".

Language	Example	Notes
Albanian	Dua të shkruaj
Macedonian	Sakam da pišuvam
Bulgarian	Iskam da piša
Modern Greek	θélo na ɣráfo	cf. Ancient Greek βούλομαι γράψαι
Romanian	Vreau să scriu	cf. Vreau a scrie, which is also correct, but rarely used
Serbo-Croatian	Želim da pišem	cf. literary Želim pisati

Postposed articles

The definite articles in the Balkans are usually postposed, i.e. placed after the noun, even though the elements may derive from words that were originally placed before the noun (as in the Romanian article).

Language	Feminine		Masculine
Albanian	shtëpi	shtëpi-a	qiell	qiell-i
Macedonian	žena	žena-ta	maž	maž-ot
Romanian	muiere	muiere-a	bărbat	bărbat-ul

This is not true of every Balkan language; for example, Greek does not follow this pattern.

Mesoamerica

The Mesoamerican linguistic area is a famous and widely discussed example in the Americas. This includes a variety of linguistic families:

Mayan
Mixe-Zoque
Oto-Manguean
Uto-Aztecan
Totonacan

For Uto-Aztecan especially, which extends all the way to Oregon, the northern languages do not share these traits. There are also smaller families or isolates not given in this list, such as Tequistlatec cited below.

This linguistic area corresponds to a more general culture area with much shared history, which of course is the means by which linguistic traits undergo diffusion.

Body-part locatives

The use of body-part expressions that we've mentioned before, such as "belly" for "in", is widespread in this area.

Mixtec (Oto-Manguean)		Kaqchikel (Mayan)		Nahuatl (Uto-Aztecan)
čihi	stomach; in(side), under	-pan	stomach; in, inside	ijti-	belly; in
šinī	back; behind	-ix	back; behind	tepotz-	shoulder; behind
nuu	face; to, at, from	-wi	head-hair; on, on top of	īx-	face; in front of
ini	heart; in, inside	-či	mouth; to, in, at	nacas-	ear; beside

These words are not related — nor are the languages — but the metaphorical basis and the syntactic context have diffused across language boundaries.

Mixtec, Kaqchikel: Campbell et al. (1986: 549); Nahuatl: Stolz & Stolz (2001: 1544)

Vigesimal numerals in Tequistlatec

While decimal numerals as in English are common (due to the fact that we have 10 fingers), other languages use base-5 (i.e. the fingers on one hand) or base-20 (i.e. fingers and toes together). Base 20, or vigesimal, systems are found throughout the Mesoamerican area. The word for 20 often involves the word "man" (i.e. a person's worth of digits). These examples are from Tequistlatec of Oaxaca, Mexico, a small isolate group (sometimes classified as Hokan).

1	anuli	20	anu-šans	= 1 man
2	ogeʔ	30	anušans gimbamaʔ	= 20 + 10
3	afanéʔ	40	ogeʔ nušans	= 2 · 20
4	amalbuʔ	50	ogeʔ nušans gimbamaʔ	= 2 · 20 + 10
5	amageʔ	60	afaneʔ nušans	= 3 · 20
6	agamtsʼús	80	amalbuʔ nušans	= 4 · 20

The important thing to note is that 30, 40, etc. are not constructed from 3, 4, etc. Some such systems may include elements that are quinary or decimal as well; thus one might find 6=5+1 or 12=10+2 alongside 40=2·20.
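The arithmetic behind the table can be sketched in a few lines of Python. This is only an illustration of the decomposition pattern (a count of twenties plus an optional ten), not a model of Tequistlatec grammar; the names `vigesimal_gloss` and `TIMES` are made up for this sketch.

```python
TIMES = "\u00b7"  # middle dot used as the multiplication sign


def vigesimal_gloss(n: int) -> str:
    """Gloss a multiple of ten as 'k x 20' plus an optional '+ 10'.

    Hypothetical helper for illustration: it reproduces only the
    arithmetic pattern, not the actual number words.
    """
    if n < 20 or n % 10 != 0:
        raise ValueError("this sketch only handles multiples of 10 from 20 upward")
    twenties, remainder = divmod(n, 20)  # remainder is 0 or 10 here
    gloss = f"{twenties} {TIMES} 20"
    if remainder:
        gloss += f" + {remainder}"
    return gloss


for n in (30, 40, 50, 60, 80):
    print(n, "=", vigesimal_gloss(n))
```

Note that 50 comes out as two twenties plus ten, with no reference to the numeral 5, which is the point of the comparison with decimal systems.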

The vigesimal pattern is ancient; numerals in the Mayan writing system (shown at right) were constructed from symbols that stand for 1 or 5, without a special symbol for 10; the cycle went from 0–19 and then restarted.
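As a rough sketch of how such a notation works, the Python functions below split a number into base-20 places and then split each place (0 to 19) into units of 5 and units of 1. The function names and the tuple output are assumptions made for illustration; the actual glyphs, including the sign for zero, are not reproduced, and the sketch assumes a pure base-20 count (Mayan calendrical counts departed from this in one position).

```python
def base20_digits(n: int) -> list:
    """Return the base-20 places of n, most significant first."""
    if n < 0:
        raise ValueError("n must be non-negative")
    digits = []
    while True:
        n, place = divmod(n, 20)
        digits.append(place)
        if n == 0:
            break
    return digits[::-1]


def fives_and_ones(place: int) -> tuple:
    """Split one place value (0-19) into (count of 5s, count of 1s)."""
    if not 0 <= place <= 19:
        raise ValueError("a place value must be between 0 and 19")
    return divmod(place, 5)


# 53 = 2 twenties + 13; 13 = two 5s + three 1s
print(base20_digits(53))   # [2, 13]
print(fives_and_ones(13))  # (2, 3)
```

Because the largest place value is 19, three symbols for 5 and four symbols for 1 are the most that any single place requires, so no separate symbol for 10 is needed.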

The Celtic languages also had vigesimal counting, and this is often thought to be the source of the standard French numerals soixante-dix (sixty-ten) for 70, quatre-vingts (four-twenties) for 80, and quatre-vingt-dix for 90. Some dialects have decimal numerals here, for example Swiss French septante, huitante/octante, nonante (based on 7, 8, 9).

Campbell et al. (1986: 546)

North American culture areas

Linguistic areas reflect long-term, intimate contact among speakers of different languages, often with multilingualism by trade or intermarriage. Such contact yields shared properties in many aspects of culture.

The main culture areas in North America are shown in this map; these areas, or subparts of them, often constitute linguistic areas as well.

Northwest Coast obstruents

The Northwest Coast is famous for its large inventories of complex sounds. For example, here are the consonants of Nootka. See also a map with languages named.

		Bilabial	Alveolar		Palatal	Velar		Uvular		Pharyngeal	Glottal
		Bilabial	central	lateral	Palatal	plain	labial	plain	labial	Pharyngeal	Glottal
Nasal	plain	m	n
Nasal	glottalized	ˀm	ˀn
Stop	plain	p	t			k	kʷ	q	qʷ		ʔ
Stop	ejective	pʼ	tʼ			kʼ	kʼʷ
Affricate	plain		ts	tɬ	tʃ
Affricate	ejective		tsʼ	tɬʼ	tʃʼ
Fricative			s	ɬ	ʃ	x	xʷ	χ	χʷ	ħ	h
Approximant	plain				j		w			ʕ
Approximant	glottalized				ˀj		ˀw

At the same time, certain sounds (such as /r/, and sometimes most labials) are absent from these languages. These patterns are reflected in the several different linguistic families represented in the region — Salish, Wakashan, Na-Dené, Penutian (or its subparts).

Bella Coola phonotactics

The Bella Coola language of British Columbia is a good example of a language that permits long strings of obstruents. It's not clear how these strings should be analyzed in terms of syllables; many analyses have been proposed.

"northeast wind"		sps
"seal fat"		sxs
"crooked"		qʷt
"that's my animal fat over there"		sc̓qctx
"bunchberry"		p̓xʷɬt
"he had had in his possession a bunchberry plant"		xłp̓x̣ʷłtłpłłskʷc̓

Alsea phonotactics

Alsea, located toward the southern end of this culture area, also permits complex strings of consonants (obstruents and sonorants), although not so impressive as those of Bella Coola.

"push him!"		cxʷt-t
"get it with the pole!"		x̣lt̕-t
"is melting it"		slx̣ʷ-tx̣
"made it straight"		cɬayq-tx̣
"was being overtaken"		cqʷanqʷ-ɬn-x̣
"hit him!"		ɬ-mk̓in-tx̣-t
"torrents of rain"		ɬ-m-ɬalx̣ʷs-x̣mt

Morpheme boundaries are included here. The longer words in Bella Coola are also morphologically complex; in fact, it has been claimed that a root morpheme in that language cannot contain more than 4 obstruents in a row (among other restrictions), as in /p̓xʷɬt/ "bunchberry".
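Counting consonants in transcriptions like these is easy to get wrong, because one segment is often written with several characters (a letter plus a diacritic or a superscript). The Python sketch below groups each base letter with its following marks and drops the hyphens that mark morpheme boundaries. The function name `segments` and the grouping rule are assumptions for illustration; the function does not know which segments are obstruents, and it does not handle marks written before the letter, such as the glottalization sign in the Nootka chart.

```python
import unicodedata


def segments(word: str) -> list:
    """Split a transcription into segments (hypothetical helper).

    Combining marks and modifier letters (Unicode category Lm, e.g. the
    superscript w) are attached to the preceding base letter.
    """
    result = []
    for ch in unicodedata.normalize("NFD", word):
        if ch == "-":
            continue  # morpheme boundary, not a sound
        attaches = unicodedata.combining(ch) != 0 or unicodedata.category(ch) == "Lm"
        if attaches and result:
            result[-1] += ch
        else:
            result.append(ch)
    return result


# "bunchberry", written with explicit code points to avoid encoding doubts:
# p + combining comma above, x + superscript w, belted l, t
bunchberry = "p\u0313x\u02b7\u026ct"
print(len(segments(bunchberry)))  # 4
```

On this count the root for "bunchberry" has exactly four segments, all obstruents, which is the claimed upper limit.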

California /t, ṭ/

A contrast in place of articulation between dental /t/ and alveolar /ṭ/ is common in California but hard to find elsewhere in North America. (It is common, however, in South Asia and Australia.) The contrast occurs in otherwise unrelated languages — shown in the map — indicating a case of areal diffusion. In addition, some languages include the affricate /c̣/ in contrast with just /t/ or both /t, ṭ/.

For example, Kashaya has this contrast in pairs of words such as these:

hotʰ "is not giving (a long object)"
hoṭʰ "warm"
nata:du "to weigh"
naṭa "child"
hoʔtʼo "head"
daʔṭʼoʔṭʼo "screech owl"

Langdon and Silver (1984) suggest that the distinction may have arisen from contact between languages that each had a single /t/ but realized it differently: dental in some, alveolar in others. On this view the contrast need not have originated in any single language, but rather emerged from the combination of languages. More specifically, the most common realization was probably alveolar, but contact with dental languages could have led to the introduction of that segment. For example, some languages of the south have an allophonic alternation between the two places of articulation, which might have been reinterpreted by speakers of other languages.

Alternatively, Catherine Callaghan reconstructs the contrast for the Miwok-Costanoan family and suggests it may have spread from there to other languages.

Types of sibilants

Sounds that are similar to /s/ may have different realizations, and multiple s-like sounds (sibilants) may be contrastive in some languages.

Non-retracted [s]		Latin American Spanish /s/		Mandarin /s/ e.g. sān "three"		Old French c(i,e) cf. cent
Retracted [ṣ]		Castilian Spanish /s/		Mandarin /ʂ/ e.g. shān "mountain"		Old French s cf. pousse > push

Borrowings of some words with Old French s into Middle English with sh suggest the difference in pronunciation indicated here (other examples are the many words from French -ir verbs, with stems in -iss-, that have English -ish: finish, polish, establish, etc.). However, many examples of Old French s were nevertheless borrowed into English as /s/, and certainly we read c(i,e) with /s/ in borrowings such as cent. Several explanations are possible: the actual Old French sound was intermediate between English /s/ and /š/; the orthography played a role; or not all dialects of Old French had the same (or any) distinction.

California /s, ṣ/

An issue in California reminiscent of the /t, ṭ/ contrast is the distinction between plain and retracted sibilants. As Bright (1978) discusses, many English-speaking field-workers in California and some nearby areas have had trouble categorizing the sibilant sounds that they heard in native languages, often vacillating between transcriptions such as [s, ṣ, š]. In some of the languages, there may be only one /s/ phoneme with various realizations, while in others there are two or three sibilants. The main point of confusion has been between lamino-alveolar [s] and apico-alveolar [ṣ]; these can be called plain and retracted. It has been observed that speakers of English use both of these articulations for /s/, although the apical version is not so retracted that the acoustic difference is striking. When [ṣ] is sufficiently retracted, however, one can hear the difference fairly easily.

The major conclusion that Bright reaches is that, while in Europe one typically considers non-retracted [s] to be the more basic or unmarked sound, in California it appears instead that retracted [ṣ] is the more common and less marked of the two.

This point is illustrated on the next few pages by means of maps that show the languages with various contrasting sibilants.

Languages with just one sibilant

Among these languages, the usual realization of the sibilant is not [s], but rather most often [ṣ]; the retracted articulation may be a possible alternant even when [s] is present and more common.

Languages with /s/ versus /š/

In a few languages, non-retracted /s/ contrasts with /š/, similar to the situation in many European languages.

Languages with /ṣ/ versus /š/

But considerably more languages contrast retracted /ṣ/ with /š/. It has been suggested that this is less common around the world because these sounds are more similar acoustically, yet it is common in this region.

Languages with /s/ versus /ṣ/

Finally, the two kinds of /s/ contrast directly in a number of the languages; this is the most common pattern of all. In some cases one of these sounds alternates noncontrastively with another fricative, such as [š] or [θ].

The widespread existence of this unusual contrast is a signal feature of California. Notably, it is not correlated with specific genetic groupings, and therefore must reflect diffusion within a linguistic area.

Languages with all three

A few languages go so far as to contrast all three sibilants /s, ṣ, š/, but you can see that this is not common.
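The five groupings above amount to a simple classification of a phoneme inventory by which of the three sibilants it contains. The Python sketch below makes that explicit. The function name `sibilant_system`, the label strings, and the sample inventories are all hypothetical; none of the inventories is data from an actual language.

```python
import unicodedata

PLAIN = "s"
RETRACTED = "\u1e63"  # s with dot below
SH = "\u0161"         # s with hacek


def sibilant_system(phonemes) -> str:
    """Label an inventory by the sibilants it contrasts (illustrative only)."""
    normalized = {unicodedata.normalize("NFC", p) for p in phonemes}
    found = [p for p in (PLAIN, RETRACTED, SH) if p in normalized]
    if not found:
        return "no sibilant"
    if len(found) == 1:
        return "one sibilant"
    if len(found) == 3:
        return "all three"
    return f"/{found[0]}/ versus /{found[1]}/"


# Made-up inventories, one per type discussed above
print(sibilant_system(["p", "t", "k", RETRACTED]))
print(sibilant_system(["p", "t", PLAIN, RETRACTED]))
print(sibilant_system([PLAIN, RETRACTED, SH]))
```

A classification like this records only phonemic contrasts. It cannot capture the cases mentioned above where one sibilant alternates noncontrastively with another fricative.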

Diffusion out of California

These maps identify languages nearby that have similar sibilant systems. For most of these, there is one sibilant that is retracted or alternates with the non-retracted type. These languages have no close relation with most of those shown in California, but English-speaking field-workers had similar problems in nailing down the contrasts.

For example, early transcriptions of Alsea generally use the letter c, which at the time stood for /ʃ/, but sometimes also use s; later transcriptions are standardized to s, and the language had no contrast between the two.