Preprints, such as those appearing on bioRxiv, have become a speedy way for scientists to share their results, but they haven’t been peer reviewed.

Do preprints improve with peer review? A little, one study suggests

Every day, scientists post dozens of preprints (studies that have not been peer reviewed) on public servers such as bioRxiv. Preprints give scientists access to cutting-edge findings faster than traditional journals do, because journals often take months to complete reviews.

But what preprints gain in speed, they may lose in reliability and credibility, critics say, because peer review can flag mistakes and deficiencies. That’s a worry especially for findings about medical treatments that nonscientists might misinterpret, possibly at risk to their health. The coronavirus pandemic has only heightened those concerns.

But peer review doesn’t appear to give a big quality boost to preprints, a recent study concludes, at least by one measure. The study, itself a preprint posted 19 March on bioRxiv, compared 56 preprints posted on bioRxiv in 2016 with the peer-reviewed versions later published in journals. Many were studies of genetics and neuroscience. In particular, the researchers examined whether the final papers reported more key details of the research, such as the types of reagents and statistical methods used, than the preprints did.

The bottom line: There was only modest improvement in this metric, which the researchers dubbed the quality of reporting. On average, they found that peer reviewers caught just one deficiency per manuscript in about 25 categories of reporting.

The study’s authors, at institutions in Brazil and several other countries, say their approach provides an objective, consistent measure of peer review’s effects. In contrast, evaluations of other features of manuscripts typically involve subjective judgments, which are harder to compare. But the authors concede that their measure of transparency doesn’t show whether referees improved the manuscripts in other important ways, such as by ensuring that methods were sound and that the evidence supported the conclusions.

“We don’t have a methodological consensus on how to measure the quality of a paper and therefore how to best evaluate peer review,” says Steven Goodman of Stanford University, who has studied peer review but did not participate in the study. “There are many dimensions to it, and [quality of] reporting is just one. … It would be misleading to say this finding shows the effect of peer review in science, writ large.” Other evidence indicates peer review provides benefits, including less “spin” in authors’ conclusions and better reporting of limitations, he adds.

Still, Goodman calls the new paper “a meaningful contribution” because future research could build on its approach, and because such studies are rare. “It’s very tough work to do” because of a lack of funding, he says.

The study’s findings suggest peer reviewers could catch more deficiencies in reporting of methods and other details, says lead author Clarissa Carneiro, a doctoral student at the Federal University of Rio de Janeiro. The resulting improvement in transparency, she adds, could help other scholars reproduce the results. And the finding that the quality of reporting in preprints is largely equivalent to that of peer-reviewed articles supports scientists’ growing use of preprints in applications for jobs and grants, she says. “Preprints can be considered valid scientific output from a research project.”

The 56 preprints the researchers examined each presented at least one original result. In each manuscript, they selected the first result presented in a figure or table, then assigned it a quality score, using a checklist they derived from multiple published guidelines created for authors and journals. The list asked whether a clinical trial was blinded, for example.

The team found the mean score for quality of reporting was 68% for preprints, but rose to 72% for the peer-reviewed published version—a slight, though statistically significant, increase. More selective journals, as measured by journal impact factor, didn’t show bigger increases.

Several reasons may explain why the scores did not improve much. For example, the preprint authors might have taken care to include adequate reporting because they knew they would be submitting the paper for peer review, Carneiro says. (BioRxiv says two-thirds of its preprints are eventually published.) And journal reviewers might not have asked for improvements because many editors do not enforce guidelines on transparency, and reviewers may be unfamiliar with such guidelines or ignore them, says Goodman, who co-founded Stanford’s Meta-research Innovation Center, which examines the reproducibility of biomedical research.

The sample of preprints in Carneiro’s study may be too small to adequately represent the diversity of subdisciplines whose scientists post on bioRxiv, Goodman adds. But some support for Carneiro’s findings came in a separate study published in 2019 in the International Journal on Digital Libraries. It used an automated approach to compare a much larger number of preprints (nearly 2500 from bioRxiv and more than 12,000 from arXiv, the physical sciences preprint server) with versions later published in refereed journals. The team employed statistical techniques to measure differences between the preprint and refereed texts and found little difference.

That doesn’t mean that peer review is worthless, says Martin Klein of Los Alamos National Laboratory, lead author of that study. It does suggest, though, the need for more research on the respective benefits that preprints and journal articles offer for scientific communication, he says. In Klein’s own discipline of computer science, he notes, researchers have already accepted and relied on preprints for decades, with the understanding that peer reviews may later refine them. “To me, this is an absolute no-brainer.”