If you've ever cringed when your parents said "groovy," you'll know that spoken language can have a brief shelf life. But frequently used words can persist for generations, even millennia, and similar sounds and meanings often turn up in very different languages. The existence of these shared words, or cognates, has led some linguists to suggest that seemingly unrelated language families can be traced back to a common ancestor. Now, a new statistical approach suggests that peoples from Alaska to Europe may share a linguistic forebear dating as far back as the end of the Ice Age, about 15,000 years ago.
"Historical linguists study language evolution using cognates the way biologists use genes," explains Mark Pagel, an evolutionary theorist at the University of Reading in the United Kingdom. For example, although about 50% of French and English words derive from a common ancestor (like "mère" and "mother"), with English and German the rate is closer to 70%, indicating that while all three languages are related, English and German have a more recent common ancestor. In the same vein, while humans, chimpanzees, and gorillas have common genes, the fact that humans share almost 99% of their DNA with chimps suggests that these two primate lineages split apart more recently.
Because words don't have DNA, researchers use cognates found in different languages today to reconstruct the ancestral "protowords." Historical linguists have observed that over time, the sounds of words tend to change in regular patterns. For example, the p sound frequently changes to f, and the t sound to th—suggesting that the Latin word pater is, well, the father of the English word father. Linguists use these known rules to work backward in time, making a best guess at how the protoword sounded. They also track the rate at which words change. Using these phylogenetic principles, some researchers have dated many common words as far back as 9000 years ago. The ancestral language known as Proto-Indo-European, for example, gave rise to languages including Hindi, Russian, French, English, and Gaelic.
Some researchers, including Pagel, believe that the world's languages are united by even older superfamilies, but this view is hotly contested. Skeptics feel that even if language families were related, words suffer from too much erosion, both in terms of sound and meaning, to be reliably traced back further than 9000 or 10,000 years, and that the similarities of many cognates may be pure chance. What was missing, Pagel says, was an objective method of analysis.
Pagel and his co-workers took a first step by building a statistical model based on Indo-European cognates. Incorporating only the frequency of a word's use and its part of speech (noun, verb, numeral, etc.), and ignoring its sound, the model could predict how long the word persisted through time. Reporting in Nature in 2007, they found that most words have about a 50% chance of being replaced by a completely different word every 2000 to 4000 years. Thus the Proto-Indo-European wata, winding its way through wasser in German, water in English, and voda in Russian, became eau in French. But some words, including I, you, here, how, not, and two, are replaced only once every 10,000 or even 20,000 years.
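To get a feel for what those replacement rates imply, here is a minimal sketch in Python. It is not the researchers' model: it simply assumes that word replacement behaves like exponential decay with a constant half-life, and it treats the "once every 10,000 or even 20,000 years" figures as half-lives too. The function name survival_probability and the 15,000-year horizon are illustrative choices.

```python
def survival_probability(years: float, half_life: float) -> float:
    """Chance that a word has NOT been replaced after `years`.

    Assumption (not from the study itself): replacement happens at a
    constant rate, so survival halves every `half_life` years.
    """
    return 0.5 ** (years / half_life)


if __name__ == "__main__":
    horizon = 15_000  # roughly the end of the Ice Age, in years before present
    # Half-lives quoted in the article: 2000-4000 years for most words,
    # 10,000-20,000 years for the most stable ones (read here as half-lives).
    for half_life in (2_000, 4_000, 10_000, 20_000):
        p = survival_probability(horizon, half_life)
        print(f"half-life {half_life:>6} yr: {p:.1%} chance of surviving {horizon} yr")
```

Under these assumptions, a typical word with a 2000-year half-life has well under a 1% chance of lasting 15,000 years, while a word with a 10,000-year half-life has roughly a 35% chance. That gap is why only a small core of very stable words could plausibly carry a signal that far back.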
The new study, appearing today in the Proceedings of the National Academy of Sciences, makes an even bolder statement. The researchers broadened the hunt to cognates from seven major language families, including Indo-European, Eskimo, Altaic (comprising many Oriental languages), and Chukchi-Kamchatkan (a group of non-Russian languages around Siberia), which have been proposed to form an ancient superfamily dubbed Eurasiatic. Again, using only the word's frequency and part of speech, the model successfully predicted that a core group of about 23 very common words, used about once per 1000 words in everyday speech, not only persists within each language group, but also sounds similar to the corresponding words in other families. The word thou, for example, has similar sound and meaning among all seven language families. Cognates include te or tu in Indo-European languages, t`i in proto-Altaic, and turi in proto-Chukchi-Kamchatkan. The words not, that, we, who, and give were cognates in five families, and nouns and verbs including mother, hand, fire, ashes, worm, hear, and pull were shared by four. Going by the rate of change of these cognates, the model suggests that these words have remained in a similar form since about 14,500 years ago, thus supporting the existence of an ancient Eurasiatic language and its now far-flung descendants.
"The model hints at a group of people living somewhere in Southern Europe as the glaciers were receding, speaking a language that might resemble those spoken today," Pagel says. "It's astonishing that spoken language can be transmitted through millennia with enough fidelity to give us information about our early history."
Whether the findings will sway the skeptics is another question, according to William Croft, a linguist at the University of New Mexico, Albuquerque. The use of methods from evolutionary biology makes the Eurasiatic superfamily more plausible, says Croft, who is more sympathetic than many to the idea. "It probably won't convince most historical linguists to accept the Eurasiatic hypothesis, but their resistance may soften somewhat."