The Ancient, Modern, and Future Language of “Dog.”

Part 1. The Ancient. Wherein the Author describes the Border War between Linguists on the history of the proto-word for “Dog.”

Part 2. The Modern. Wherein the Author describes Dog’s omnipresence in modern language.

Part 3. The Future. Wherein the Author describes Dog’s presence in the babble and first words of children.

I ran across a border war today while reading up on an article I will discuss in the third part of this series. Like many of the conflicts that pique my curiosity, this one has a dog at its center. You can tell that an issue is likely a border war when you search for a topic (in this case “global etymologies”) on Google and the first page is filled with rants against the fundamental idea instead of links to the original content.

Anthropologists study the history of human groups and migrations by examining the common genetic elements of those groups, searching for the most recent common ancestors (hypothetical ‘Eve’s). Interdisciplinary historical linguists study the history, migration, and interaction of language (and thus people) by comparing common sounds and word meanings between languages, and in doing so classify language families and construct proto-languages. The mother of all such languages is called the Proto-World Language.

I suspect that the Proto-World Language is like the Holy Grail for historical linguists. It’s more of a guiding concept than a reality. Many don’t believe it can be found, others don’t believe that it ever existed in the first place, and anyone who turns up a clue or a possible path is resoundingly attacked by everyone else. Some attack because they don’t believe in the idea at all, others because they too are on the hunt and another’s success is their failure.

Like many border wars, this one seems to fall in the “techie vs. fuzzy” mold, although to an outsider the differences between the two groups seem trivial, making the narcissism of minor differences a distinct possibility. The fuzzy linguists want to tell a good story, bask in romantic histories, ask how the languages feel about each other, and do it this way because they’ve always done it this way; this is the comparative method.

The Sanskrit language, whatever be its antiquity, is of a wonderful structure; more perfect than the Greek, more copious than the Latin, and more exquisitely refined than either, yet bearing to both of them a stronger affinity, both in the roots of verbs and the forms of grammar, than could possibly have been produced by accident; so strong indeed, that no philologer could examine them all three, without believing them to have sprung from some common source, which, perhaps, no longer exists.
Sir William Jones, 1786; quoted by Lehmann 1967 and Szemerényi 1996:4

The techie linguists aren’t scared of letting numbers tell the story, of using new techniques to inform old debates, of using old techniques with new tools, or of looking at data before they reach conclusions instead of telling a story and then looking for “data.” Their method of choice is called multilateral or mass lexical comparison. The obvious criticism from the fuzzies is that numbers can mistake coincidence for correlation, but any techie knows that correlation does not imply causation. The fuzzies say that random coincidence is “quite high,” although without an analysis of the expected random coincidence vs. the observed resemblance (decidedly techie), I don’t know what basis the fuzzies have for stating that such coincidence is tainting the techies’ results. The mass lexical comparison method seems pretty straightforward and sound to me:

If then, we find a mass of resemblances between different languages, resemblances that are not onomatopoetic in nature and do not appear to be borrowings, we must conclude that the similarities are the result of a common origin, followed by a descent with modification in the daughter languages.
J.D. Bengtson and M. Ruhlen, On the Origin of Languages: Studies in Linguistic Taxonomy. Stanford: Stanford University Press, 1994, p. 43.
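For the curious, here is roughly what an expected-vs.-observed coincidence check could look like. This is only a minimal sketch of a permutation test, not the procedure Bengtson and Ruhlen (or their critics) actually used. The two word lists are invented toy data, not real languages, and the consonant classes and the "same first two consonant classes" resemblance rule are my own assumptions for illustration.

```python
import random

# Hypothetical, coarse consonant classes (an assumption for this sketch):
# sounds in the same class count as "similar".
CLASSES = {
    "p": "P", "b": "P", "m": "M", "n": "N",
    "t": "T", "d": "T", "k": "K", "g": "K",
    "s": "S", "z": "S", "l": "L", "r": "L",
}

# Invented toy word lists, aligned by meaning slot (slot 0 with slot 0, etc.).
LANG_A = ["kanu", "tiko", "puli", "meni", "sotu", "labe", "niko", "duma"]
LANG_B = ["gani", "dego", "bola", "mino", "lapu", "keta", "zumi", "rato"]


def skeleton(word, size=2):
    """Return the classes of the first `size` consonants, ignoring vowels."""
    return tuple(CLASSES[ch] for ch in word if ch in CLASSES)[:size]


def count_matches(words_a, words_b):
    """Count meaning slots whose two words share a consonant skeleton."""
    return sum(skeleton(a) == skeleton(b) for a, b in zip(words_a, words_b))


def permutation_test(words_a, words_b, trials=10_000, seed=0):
    """Compare observed matches with matches under shuffled meanings.

    Shuffling one list breaks the meaning alignment, so whatever matches
    remain are resemblances produced by coincidence alone.
    """
    rng = random.Random(seed)
    observed = count_matches(words_a, words_b)
    shuffled = list(words_b)
    chance_counts = []
    for _ in range(trials):
        rng.shuffle(shuffled)
        chance_counts.append(count_matches(words_a, shuffled))
    expected = sum(chance_counts) / trials
    # Fraction of shuffles that do at least as well as the real alignment.
    p_value = sum(c >= observed for c in chance_counts) / trials
    return observed, expected, p_value


if __name__ == "__main__":
    observed, expected, p_value = permutation_test(LANG_A, LANG_B)
    print(f"observed={observed} expected_by_chance={expected:.2f} p={p_value:.4f}")
```

If the observed count sits well above what the shuffles produce, coincidence is a poor explanation for these particular lists. It says nothing about borrowing or onomatopoeia, which would have to be ruled out separately.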

In my usual interdisciplinary stance, I figure there is room for both methods and perhaps the two methods can inform each other. Not too many people agree.

Needless to say, the fuzzy linguists have launched a full scale border war on the techie linguists (numbers lie and are scary, scientific fan-fiction makes you feel good and what feels good must be true). Consistent with my Confederacy of Dunces theory, where the sound and fury from the establishment against a new and provocative idea is entirely inconsistent with the weight of the idea, this border war features a preemptive strike by the comparative fuzzies. The old school linguists actually published an anti-global etymologies paper (Joseph Salmons, 92) two years before the original global etymologies paper (Bengtson & Ruhlen, 94) was even published.

The essential argument in the Language Log article is that research the group-think fuzzies don’t agree with shouldn’t even be published, because that’s the purpose of “peer review”: to enforce group-think. I especially like the hypocrisy where the author complains that because the referees are anonymous, there can’t be a “public debate” (read: mob lynching) to force them to censor unpopular views (read: antithesis of public debate). The author’s criticism that peer reviewers are unqualified to judge outside of their specialty is code speak for ‘they haven’t been indoctrinated into enforcing the group-think.’

Doctor Merritt Ruhlen and linguistic expert John Bengtson fall in the techie group and are the target of the above article because they used technical analysis to discern a list of 27 “global etymologies.” Each etymology is a set of cognates: similar words in different languages that are likely to have a common origin. Critics (read: confederacy of dunces) incorrectly classify these global etymologies as “reconstructions” of the Proto-World Language, and thus they have doubly attacked Ruhlen and Bengtson and any other linguists or anthropologists who use their work. But Ruhlen and Bengtson don’t make that claim, and they say so explicitly:

For each etymology…we present a phonetic and semantic gloss, followed by examples from different language families. …We do not deal here with reconstruction, and these glosses are intended merely to characterize the most general meaning and phonological shape of each root. Future work on reconstruction will no doubt discover cases where the most widespread meaning or shape was not original.
J.D. Bengtson and M. Ruhlen, On the Origin of Languages: Studies in Linguistic Taxonomy. Stanford: Stanford University Press, 1994, p. 14 note 3.

You’ll notice that Dr. William Poser uses “reconstruct*” no fewer than 25 times in his criticism. Too bad he didn’t make it to 27; it would have provided some nice symmetry with the 27 cognates Bengtson and Ruhlen unearthed. One linguist injected 25 mistaken words into his analysis because he wanted them there; two linguists arrived at 27 words because their technique spat them out. The problem with Dr. Poser is that he doesn’t have an excuse to fall so firmly into the fuzzy mindset: he studied Electrical Engineering along with Classics and Linguistics.

That’s the difference I see here: Bengtson and Ruhlen developed a method (the black box), then took data, ran it through the box, and waited to see what came out the bottom. Criticize the box all you want, substitute your own, but what drops out the bottom is governed only by the rules in the box, and those rules are clear, explicit, and easy to construct without bias. The fuzzy linguists don’t develop a method and then run data through it; they massage the method until the results make “sense” and tell a story. They don’t let an unbiased method put words together; they find a reason that two words they select make sense together. Untold degrees of intentional and unintentional bias infect the input data and the results when you try to make them say something you understand instead of trying to understand what they are really saying.

And what are those 27 threatening words? The source of the bitter bickering and posturing? The results of the black box and possible links to the holy grail of all languages?

Bengtson and Ruhlen’s 27 Global Etymologies
  1. AJA – mother, older female relative
  2. BU(N)KA – knee, to bend
  3. BUR – ashes, dust
  4. CHUN(G)A – nose, to smell
  5. KAMA – hold (in the hand)
  6. KANO – arm
  7. KATI – bone
  8. K’OLO – hole
  9. KUAN – dog
  10. KU(N) – who?
  11. KUNA – woman
  12. MAKO – child
  13. MALIQ’A – to suck(le), nurse, breast
  14. MANA – to stay (in a place)
  15. MANO – man
  16. MENA – to think (about)
  17. MI(N) – what?
  18. PAL – the number two
  19. PAR – to fly
  20. POKO – arm
  21. PUTI – vulva
  22. TEKU – leg, foot
  23. TIK – finger, the number one
  24. TIKA – earth
  25. TSAKU – leg, foot
  26. TSUMA – hair
  27. ?AQ’WA – water (Question mark denotes a glottal stop.)

You can tell one hell of an (R-rated) story with those 27 words, and perhaps that’s why they are so interesting. They aren’t from a fuzzy storytelling method, but a techie method. And before you think all of us techies still live in our aja’s basement, have tsuma on our backs and puti on the brain, jealous of prehistoric mano running around kamaing a large dinosaur kati, trying to get to pal base with kunas by bashing them on the head and dragging them back to our k’olo by their tsaku, we’re not. Some of us do shave our backs.

Now Dr. Poser isn’t all bad. He gives a very nice explanation of the ancient words for dog:

Although sound change is the main way in which words change over time, it is also possible for a word to be replaced by an entirely different word. For example, the Proto-Indo-European word for “dog” was something like *kuon. (The star indicates that this is a hypothetical form.) We reconstruct this form from attested (actually recorded) forms like Greek kuon, Sanskrit shvan, and German hund by asking what proto-form would yield the attested forms after undergoing the sound changes observed in the various languages, and also taking into account changes in word-formation. The direct descendant of this word in English is hound. But at some point the common Germanic word for “dog” took on a more specialized meaning and was replaced, as the general term, by dog, a word whose origin we do not know.

This fits nicely with the attention Ruhlen and Bengtson gave to dog in their work:

9. *KUAN – ‘dog’ – canine; cynic; hound; !Kung /gwi ‘hyena’; Proto-Afro-Asiatic *k(y)n ‘dog, wolf’; Proto-Indo-European *kwon- ‘dog’ > Sanskrit s’van, Phrygian kan, Latin canis, Greek kuon, Germanic hund; Proto-Uralic *küinä ‘wolf’; Old Turkish qanchiq ‘bitch’; Mongolian qani ‘wild dog’; Proto-Tungus-Manchu *khina ‘dog’; Korean ka ‘dog’ (< kani); Gilyak kan ‘dog’; Chinese kou ‘dog’ (khjwen); Tibetan khyi ‘dog’; Proto-Oceanic *nkaun ‘dog’; Taos kwiane-, Tewa tukhwana ‘fox, coyote’

If we really could assign dates to the mutations in DNA and the changes in our language, we just might find that dogs became biologically distinguished from wolves at about the same time man used one word to describe a wolf and another to describe a domesticated dog.

