Monday, December 10, 2007

Dead Hand of the Comparative Method - 1

I may sound like a linguist, and even flatter myself occasionally that I'm learning and aspiring to be one, but I try to keep my head clear, as much as possible, from the complex phonological and grammatical technicalities that seem to be the meat of so many current An linguistics papers:
- The origin of the Kelabit voiced aspirates: a historical hypothesis revisited.
- On the origin of Philippine vowel grades.
- The pronoun system in Galeya: arguments against a clitic analysis.
(Oceanic Linguistics, Dec 2006)
Of course, these phonological and grammatical details are essential in the recording and study of existing languages, as a whole. They are extremely useful in differentiating between, and grouping, related languages.

But they are simply irrelevant in studying the typology and semantics of number words and number-sets, en masse, as I am attempting to do. (Later on, if I ever get there, I may use phonology to resolve the very ends of the twigs, or, if it's worthwhile, to analyse the mass of Western Malayo-Polynesian number-sets which look, very boringly, almost identical. It is essential to know the inherited sounds of a particular language to be able to distinguish between inherited words ('reflexes') and borrowed words.)

I do feel that the analytical power of phonology is greatly over-estimated in the special field of historical linguistics, and that over-reliance on it can lead to very misleading results.

A statement like this, from the doyen of An linguists, who kindly sent me the numerals chapter of his forthcoming book:
"Although this may seem like a drastic departure from the decimal system that these languages inherited from a remote common ancestor, even more drastic innovations in numeral systems are found in some AN languages of New Guinea, where they clearly reflect Papuan contact influence."

makes me go quietly ballistic.

It completely reverses the 'normal' hypothesis that a less-developed number system will evolve into a more-developed one. It states that, somehow, all the less-developed and 'irregular' number systems in Austronesian languages (more than half of them) are 'innovations' from a 'pure' An ancestor.

[And, so far, I have not found a single instance where an An number system in New Guinea can be shown to 'clearly reflect Papuan contact influence']. If anything, it seems to have been quite the opposite.

It's easy to see how this happened: the WMP (Western Malayo-Polynesian, once called Indonesian) languages (from which PAn was originally derived) are fairly well homogenised, and all have a full decimal system (with only two minor exceptions). The majority of them use recognisably similar words for those numerals.

- Therefore, so does PAn (proto-Austronesian), the reconstructed ancestor of all Austronesian languages. You sift through those words, just like you'd pan some gravel, end up with some glittering cognates, and reconstruct the familiar 'proto-Austronesian':
*esa/isa, *duSa, *telu, *Sepat, *lima, *enem, *pitu, *walu, *Siwa, *sa-puluq

- Therefore, POc (proto-Oceanic) must also inherit this system, because it also has to demonstrate descent from PAn.

- Therefore, if you reconstruct a proto-lexicon by sieving through a gradually-reducing mesh of current cognate words you will end up with a 'proto numeral' lexicon that mirrors the majority of current vocabularies, even if that particular lexicon set has been relatively recently introduced, and become very rapidly widespread. If that numeral lexicon also implies a fully-developed number system, you've allowed yourself to be led right down the plughole.

From there on, you must consider anything else as a deviant numerical practice, or innovation (probably brought about by miscegenation with fuzzy-wuzzies).

Therefore, most Melanesians and Polynesians are raving numerical deviants.

The limitations of the Comparative Method are revealed (between the lines) by its greatest current exponent, Robert Blust:

Historical linguistics depends for its results on two fundamental and by now well-tested claims about the nature of language: (1) The relationship between sound and meaning is largely arbitrary, and (2) sound change is largely regular. The first of these claims was first clearly enunciated by Saussure (1959), and the second by various of the Neogrammarians during the last three decades of the nineteenth century. Both have been challenged in various ways, but both remain as pillars of linguistic method.

Like everything in Nature, language changes. In time words come to differ in shape and perhaps also in meaning. Since sound change is regular, the differences in the sound shape of words are systematic, and permit the original forms to be reconstituted with a rather high degree of confidence. The procedures followed in such reconstitution of prehistoric forms are collectively known as the Comparative Method. Where we have documentary checks, as in comparing the modern Romance languages with their immediate common ancestor, Latin, we are encouraged that even in the absence of documentary support our results will not ordinarily go far wrong.
The application of the Comparative Method to related (cognate) words by a process of triangulation results in a reconstruction of the sound system and vocabulary of an earlier language, called a proto-language.

To illustrate with three simple examples, Malay langit, Samoan langi, Hawaiian lani "sky", Malay tangis, Samoan tangi, Hawaiian kani "weep"; and Malay mata, Samoan mata, Hawaiian maka "eye" show recurrent correspondences of sound in words of related meaning, and so are assumed to derive from (reflect) a common ancestral form in each case, conventionally preceded by an asterisk to show that it is based on inference, not on observation.
For our purposes here (leaving out information that can be supplied only by the aboriginal languages of Taiwan), these forms can be reconstructed as *langit "sky", *tangis "weep" and *mata "eye".

Robert Blust
The Prehistory of the Austronesian-Speaking Peoples: A View from Language
Journal of World Prehistory, Vol. 9, No. 4, 1995
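The bookkeeping behind Blust's three examples can be sketched in a few lines of Python. This is only a toy under stated assumptions, not the Comparative Method as practised: the function names (`segments`, `correspondences`) are invented for illustration, `ng` is treated as a single sound, and the words are simply lined up left to right, which works here only because the one length difference (Malay's final consonant) falls at the end of the word.

```python
from collections import Counter
from itertools import zip_longest

# The three cognate sets quoted above, in the order (Malay, Samoan, Hawaiian).
COGNATE_SETS = {
    "sky": ("langit", "langi", "lani"),
    "weep": ("tangis", "tangi", "kani"),
    "eye": ("mata", "mata", "maka"),
}


def segments(word):
    """Split a word into sounds, treating the digraph 'ng' as one segment (an assumption)."""
    out = []
    i = 0
    while i < len(word):
        if word.startswith("ng", i):
            out.append("ng")
            i += 2
        else:
            out.append(word[i])
            i += 1
    return out


def correspondences(cognate_sets):
    """Count each column of aligned sounds across all cognate sets.

    Alignment is naive: position by position from the left, padding
    shorter words with '-' on the right.
    """
    counts = Counter()
    for forms in cognate_sets.values():
        columns = zip_longest(*(segments(f) for f in forms), fillvalue="-")
        for column in columns:
            counts[column] += 1
    return counts


counts = correspondences(COGNATE_SETS)

# A correspondence is 'recurrent' here if it turns up more than once.
recurrent = {column: n for column, n in counts.items() if n > 1}

for column, n in recurrent.items():
    print(" : ".join(column), "x", n)
```

On these three cognate sets the columns that recur are a : a : a, ng : ng : n, i : i : i and t : t : k. The last is the Malay t, Samoan t, Hawaiian k correspondence visible in tangis/kani and mata/maka. With only three words the evidence is obviously thin, and the sketch says nothing about whether a word was inherited or borrowed.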

Having reconstructed a proto-language, you can then propose it as the root of a family tree, and trace back the branches to separate the modern, existing languages into their bunches, or groups, each bunch of twigs descended from one node on a major branch.

You end up with a family tree that looks like this:
[Image: family tree of the Austronesian languages]

You'll soon notice a few peculiarities:
1) The family tree is upside down. This is only one of linguistics' weird idiosyncrasies, where 'reflect' means 'derive from', 'innovation' can mean 'reversion', etc.

2) The tree is heavily weighted towards the right, i.e. towards Oceanic, with a proto-language featured at each node. On the other major branches:

- Formosan languages
- Western Malayo-Polynesian

there are no proto-languages at the major nodes at all.
Each of those major groups has, so far, proved impossible to reduce to an ancestral proto-language.
Which is a great pity, since it implies that the majority of current Austronesian speakers speak an orphan language, or at least one whose immediate parental identity and location are in doubt.

But you've got a neat map, showing the distribution of language groups.