The world of academic publishing is an oligarchy. Not only are the vast majority of highly cited papers authored by an elite 1% of scientists, but a small group of elite journals also gets the lion's share of citations and media attention. But this rarefied world is becoming more egalitarian, according to a study released 9 October by the team that develops Google Scholar, the free literature search engine now used by virtually every scientist in the world. The study is the strongest evidence yet that the dominance of the elite journals is eroding, thanks in part to how much easier it has become for scientists to find and cite obscure but relevant papers.
As recently as the 1990s, most scientists found each other's work by cracking open a journal that their university subscribed to and reading the articles in print. But even with speed-reading, humans just can't read fast enough to explore more than a tiny portion of the more than 1 million academic papers published every year. The digitization of journals has allowed computers to do the searching for us.
To mark their 10th anniversary next month, the Google Scholar team is taking a short break from building and maintaining their scholarly search engine. "We wanted to take a look back and see how things have changed," says Anurag Acharya, a computer scientist who co-founded the project at Google in 2004.
The team has a massive data set to explore, encompassing about 160 million documents according to one estimate, although the exact size of its corpus is not public information. The data include not just most journal articles ever published but also just about anything deemed "scholarly" that has been digitized and put online: Ph.D. theses, books, patents, and even conference posters. All of this material is harvested automatically by Google's Web-crawling software, and researchers have discovered that Google Scholar's algorithms can therefore be gamed to artificially boost citations. But so far, no high-profile exploitations have been reported.
To see how academic publishing has evolved, Acharya's team analyzed the journals and articles within each of the search engine's 261 subject areas, which fall within nine broad fields, for each year from 1995 through 2013. Within each area, they ranked journals by how frequently their articles are cited, defining the top 10 as "elite" and the rest as "non-elite." They then ranked all the papers in each area by citations, regardless of the journal in which they were published.
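For readers who want to see the mechanics, the classification described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Google Scholar team's code: the function names, the `(journal, citations)` record format, the toy data, and the `top_k` cutoff for "most highly cited" papers are all hypothetical, and the toy example treats the top 2 journals as elite rather than the top 10 so that it fits on a screen.

```python
from collections import defaultdict


def elite_journals(papers, top_n=10):
    """Return the set of the top_n journals by total citations.

    papers: list of (journal, citations) tuples for one subject area
    and one year (a hypothetical record format for this sketch).
    """
    totals = defaultdict(int)
    for journal, cites in papers:
        totals[journal] += cites
    # Rank by total citations, breaking ties alphabetically.
    ranked = sorted(totals, key=lambda j: (-totals[j], j))
    return set(ranked[:top_n])


def nonelite_citation_share(papers, elite):
    """Fraction of all citations that point to non-elite journals."""
    total = sum(cites for _, cites in papers)
    if total == 0:
        return 0.0
    nonelite = sum(cites for journal, cites in papers if journal not in elite)
    return nonelite / total


def nonelite_top_paper_share(papers, elite, top_k):
    """Fraction of the top_k most-cited papers published in non-elite journals.

    top_k is an assumed cutoff; the article does not state the one used.
    """
    top = sorted(papers, key=lambda p: -p[1])[:top_k]
    if not top:
        return 0.0
    return sum(1 for journal, _ in top if journal not in elite) / len(top)


# Toy data for a single subject area and year (invented for illustration).
papers = [("A", 50), ("A", 30), ("B", 40), ("C", 45), ("C", 5), ("D", 10)]
elite = elite_journals(papers, top_n=2)
print(sorted(elite))
print(round(nonelite_citation_share(papers, elite), 3))
print(round(nonelite_top_paper_share(papers, elite, top_k=3), 3))
```

Repeating the calculation for each subject area and each year would give the kind of trend lines the study reports. Note that the sketch ranks papers independently of journals, so a highly cited paper in a low-ranked journal still counts toward the non-elite share.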
Elite journals like Science and Nature are still on top of the heap, they found, but lower-ranked journals in many fields have been making gains, the team reports in a paper posted to arXiv titled "Rise of the Rest: The Growing Impact of Non-Elite Journals." In 1995, only 27% of citations pointed to articles published in nonelite journals. That portion grew to 47% by 2013. And the nonelite journals published an increasing share of the most highly cited papers within each field as well, growing from 14% to 24%. The most dramatic egalitarian trends were in the areas of Computer Science, with a 133% increase in citations to nonelite journal articles, and Physics & Mathematics, with the fraction of most cited papers in nonelite journals more than tripling over the past 2 decades.
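The figures above mix two kinds of change: shifts in a share (27% to 47%) and relative increases (133%, "more than tripling"). A short sketch, with hypothetical helper names, shows how the two relate. It uses only the overall shares quoted in the article; the field-level starting and ending shares are not given here, so they are not reproduced.

```python
def percentage_point_change(old_share, new_share):
    """Absolute change between two shares, in percentage points."""
    return new_share - old_share


def relative_change(old_share, new_share):
    """Change relative to the starting share, as a percentage."""
    return (new_share - old_share) / old_share * 100


def growth_factor(percent_increase):
    """Multiplier implied by a relative increase (100% means doubling)."""
    return 1 + percent_increase / 100


# Overall share of citations going to nonelite journals, 1995 vs. 2013.
print(percentage_point_change(27, 47))
print(round(relative_change(27, 47), 1))

# Overall share of the most highly cited papers in nonelite journals.
print(percentage_point_change(14, 24))
print(round(relative_change(14, 24), 1))

# A 133% increase corresponds to a bit more than doubling.
print(round(growth_factor(133), 2))
```

On the overall numbers, the 20-point rise in citation share works out to a relative increase of roughly 74%, and the 10-point rise in the share of top papers to roughly 71%. By the same arithmetic, "more than tripling" means a relative increase of more than 200%.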
The results echo those of a study published online last year in the Journal of the Association for Information Science and Technology that analyzed citation data from Thomson Reuters’ Web of Science. "I am pretty happy to see them corroborated using another data source," says Vincent Larivière, the lead author of that study and a library scientist at the University of Montreal in Canada. In an e-mail to ScienceInsider, Larivière noted that "the coverage of Google Scholar is much (much) larger" than Web of Science and other competing academic search engines, so the results could have been wildly different. The egalitarian trend "is a pretty strong phenomenon, and independent from the data sources," he says.
"Their explanation for the results is definitely sound," says Hadas Shema, a library scientist at Bar-Ilan University in Ramat Gan, Israel, and an expert on journal jockeying. But she notes that some of the trends may have more mundane explanations. For example, "The large jump (204%) in citations for non-elite journals in Physics & Mathematics probably has to do with the establishment of the ArXiv repository in 1991," she says. "The custom of depositing a free copy in ArXiv before 'official' publication [has] made the publication in a journal more of a formality." And the growth of open-access journals and open-access article repositories may also explain some of the leveling, she adds. The Google Scholar study "didn’t differentiate between different types of documents," she notes, instead lumping everything together.
*Correction, 15 October, 11:04 a.m.: The fraction of most cited physics and mathematics papers in nonelite journals has more than tripled, not more than doubled, in the past 2 decades.