New Guinea is one of the most linguistically diverse places in the world, with more than 1000 distinct languages crammed into an area not much larger than the state of Texas. Despite this rich variety (for comparison, Europe contains about 280 languages), linguists have analyzed the grammatical structures of only a fraction of the South Pacific island’s languages. Now, Simon Greenhill, a linguist at Australian National University in Canberra, is trying to remedy that situation by gathering hundreds of thousands of words from published surveys, book chapters, and articles, as well as the accounts of early European explorers, and putting them into an online database called TransNewGuinea.org. Updated daily, the site already contains glossaries for more than 1000 languages from 23 different language families, totaling 145,000 words. There are roughly 1000 different words for “water,” as well as for “louse,” and linguists and language enthusiasts can view all the languages by geographic origin in an interactive map. Greenhill introduced the scientific community to the site this week in the journal PLOS ONE; already, he has used the database to look for clues about how the different languages are related. He hopes the linguistic community will now use the site, through comparative, historical, and computational analyses of the data, to solve long-standing questions about how New Guinean populations expanded and spread their culture.