The Origin of Indo-European Languages
Science has just published an article by nine authors from 13 academic departments around the world on the geographical origin of the Indo-European languages:
There are two competing hypotheses for the origin of the Indo-European language family. The conventional view places the homeland in the Pontic steppes about 6000 years ago. An alternative hypothesis claims that the languages spread from Anatolia with the expansion of farming 8000 to 9500 years ago. We used Bayesian phylogeographic approaches, together with basic vocabulary data from 103 ancient and contemporary Indo-European languages, to explicitly model the expansion of the family and test these hypotheses. We found decisive support for an Anatolian origin over a steppe origin. Both the inferred timing and root location of the Indo-European language trees fit with an agricultural expansion from Anatolia beginning 8000 to 9500 years ago. These results highlight the critical role that phylogeographic inference can play in resolving debates about human prehistory.
The non-technical version comes from the website of the University of Auckland, which is home to the corresponding author, Quentin Atkinson:
In this paper we identify the homeland of the Indo-European language family by adapting ‘phylogeographic’ methods initially developed by epidemiologists to trace the origins of virus outbreaks. Instead of comparing viruses, we compare languages and instead of DNA, we look for shared cognates – words that have a common origin, such as “mother,” “mutter” and “madre” – across various Indo-European languages. We use the cognates to infer a family tree of the languages and, together with information about the location of each language, we trace back through time to infer the location at the root of the tree – the origin of Indo-European.
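To make the cognate comparison concrete, here is a minimal sketch of how shared cognates can be coded as presence/absence data, much like characters in a sequence alignment. Everything in it is an illustrative assumption: the three-word vocabulary, the cognate class labels, and the function names are hypothetical and not taken from the study's data set, and the simple overlap score stands in for the Bayesian tree inference the authors actually describe.

```python
# Hypothetical toy data: meaning -> {language: cognate class label}.
# Words sharing a label are treated as cognate (e.g. "mother"/"Mutter"/"madre").
# The labels are illustrative and do not come from the study's data.
COGNATES = {
    "mother": {"English": "A", "German": "A", "Spanish": "A"},
    "dog": {"English": "B", "German": "C", "Spanish": "D"},
    "water": {"English": "E", "German": "E", "Spanish": "F"},
}


def binary_matrix(cognates):
    """Code each (meaning, cognate class) pair as one 0/1 column per language."""
    columns = sorted(
        {(meaning, cls) for meaning, by_lang in cognates.items() for cls in by_lang.values()}
    )
    languages = sorted({lang for by_lang in cognates.values() for lang in by_lang})
    rows = {
        lang: [1 if cognates[meaning].get(lang) == cls else 0 for meaning, cls in columns]
        for lang in languages
    }
    return columns, rows


def shared_fraction(a, b):
    """Jaccard overlap: classes present in both languages / classes present in either."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


columns, rows = binary_matrix(COGNATES)
for first, second in [("English", "German"), ("English", "Spanish")]:
    print(first, second, shared_fraction(rows[first], rows[second]))
```

In this toy example English and German share more cognate classes than English and Spanish, which is the kind of signal a tree-building method would use to group them more closely.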
A perspective article in Science explains:
Charles Darwin noted in 1871 that languages, like plants and animals, could be classed into related groups. Each language arose only once, in one place, and modern languages descended with modification from ancestral ones. The proofs of language and species evolution “are curiously parallel,” he wrote in The Descent of Man.
Atkinson and colleagues applied Darwin’s analogy to the Indo-European language family, which includes varied but related tongues such as English, Italian, Albanian, Persian, and Hindi. Linguists have long sought clues to the origins and spread of these languages by analyzing their vocabulary, sounds, and grammar, and by studying the archaeology of ancient migrations. Evidence marshalled by Renfrew in the 1980s suggested an Anatolian homeland, the same land from which the first farmers spread 8000 or 9000 years ago. But recent archaeological and linguistic data have pointed to an origin on the steppes north of the Black and Caspian seas, where seminomadic herders known as the Yamnaya expanded into Europe and Asia with domesticated horses and wheeled carts beginning perhaps 5000 years ago.
Enter Atkinson and a research team drawn largely from the fields of biology, computer science, and psychology. They focused on vocabulary, specifically the gain and loss of cognates, or words in related languages—such as “mother” in English and “mutter” in German—that stem from a single ancestral root. The team used cognates from other studies on 103 ancient and modern Indo-European languages. They considered this data set analogous to molecular sequence data, with the rate of cognate gain and loss akin to the rate of nucleotide substitution in viral evolution.
Atkinson’s team also added published data on the geographical ranges of all 103 languages, plus historical dates for language divergence, such as the breakup of the Roman Empire, which triggered the evolution of Romance languages from a type of Latin. The computer model then worked back in time, inferring possible ancestral relationships and patterns of diffusion, and generating possible homelands. Then the team compared “how often the origin locations fell into the range proposed for the Anatolian theory versus the Steppe theory,” Atkinson says. The Anatolian hypothesis won hands down.
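The final comparison described above, counting how often inferred origin locations fall inside each proposed homeland, can be sketched as follows. This is only an illustration under stated assumptions: the rectangular bounding boxes are rough hypothetical stand-ins for the two regions, the sample points are invented, and the function names are not from the paper, which worked with modelled ranges and a full posterior sample of trees.

```python
# Hypothetical bounding boxes as (lat_min, lat_max, lon_min, lon_max).
# These are rough illustrative rectangles, not the ranges used in the study.
ANATOLIA_BOX = (36.0, 42.0, 26.0, 45.0)
STEPPE_BOX = (45.0, 55.0, 30.0, 60.0)

# Invented (latitude, longitude) root locations standing in for posterior samples.
ROOT_SAMPLES = [
    (38.5, 33.0),
    (39.2, 35.1),
    (37.8, 31.4),
    (48.0, 40.0),
    (40.1, 36.7),
]


def in_box(point, box):
    """Return True if a (lat, lon) point lies inside the bounding box."""
    lat, lon = point
    lat_min, lat_max, lon_min, lon_max = box
    return lat_min <= lat <= lat_max and lon_min <= lon <= lon_max


def support(samples, box):
    """Fraction of sampled root locations that fall inside the box."""
    return sum(in_box(point, box) for point in samples) / len(samples)


print("Anatolia:", support(ROOT_SAMPLES, ANATOLIA_BOX))
print("Steppe:", support(ROOT_SAMPLES, STEPPE_BOX))
```

A higher fraction for one region means more of the sampled origins landed there; turning such fractions into a formal measure of support would also require accounting for how likely each region was before looking at the data.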
The analogy to virus evolution is “clever and insightful,” says linguist Paul Heggarty of the Max Planck Institute for Evolutionary Anthropology in Leipzig, Germany, who thinks the study will prompt other linguists to try the method. He says the study will give supporters of the Steppe hypothesis “plenty of explaining to do.”
Other researchers, however, take strong issue with the findings. Anthony notes that Atkinson and his colleagues limited their study to vocabulary, just one of three subsets of linguistic data, “something you are really not supposed to do,” he says. In addition, the authors rooted their model in geography mainly using modern distributions of languages. “The results don’t tell you much about the past,” Anthony concludes.
The paper makes many inferences on matters such as the rates of language change and how languages diffuse, says Victor Mair, a Chinese language expert at the University of Pennsylvania. “There is so much about this paper that is arbitrary,” he says. By comparison, he says, the Steppe hypothesis “is based heavily on archaeological data such as burial patterns, which are directly tied to datable materials.”
The spread of a language does not necessarily imply the spread of a population. The expansion of English into China is one example: the language is gaining speakers there without any comparable migration of English-speaking people.