History is replete with stories of scientists who hid their ideas from their competition; consider Leonardo da Vinci, whose odd backward writing may have been partly motivated by fear of thieves, or Isaac Newton, who concealed one idea by writing it in the form of an anagram. Science has long been a dog-eat-dog world.
So it may seem odd that a handful of scientists are going to similar lengths to share not just their results but also, sometimes, their raw data -- even their lab notebooks -- often in real time. They're part of a movement called Open Science, or, more specifically, Open Notebook Science, whose motto is "no insider information." (For more open-science terminology, see the box below.)
At first glance, going "open" would seem like a serious career risk -- years of work could be for nothing if a competitor uses your work to beat you to publication -- but many practitioners of openness say the benefits outweigh those risks. The benefits include increased opportunities for collaboration, more feedback from colleagues, and a greater likelihood that the research will get to the people who can use it. Counterintuitively, practitioners say that being open supports their claims of priority and relieves their anxiety about getting ripped off.
"I definitely believe that science in general is more effective the more open people are," says evolutionary biologist Jonathan Eisen of the University of California (UC), Davis, who keeps much of his research open. "There are unquestionably risks for people that come with [openness], but the benefits to society are enormous. Given that taxpayers are paying for our work, I think that the default should be to be open unless you can prove that it's a bad idea."
Steve Koch, an experimental biophysicist at the University of New Mexico in Albuquerque, is one of the most avid Open Notebook Science practitioners. About 2 years ago, he started sharing some of his ideas and results through his blog. At the same time, one of his graduate students, Anthony Salvagno, did an open experiment, reporting all of his results on the OpenWetWare site, which hosts wiki-style lab notebooks. Both projects led to feedback and collaboration offers from researchers around the world. They felt that they had hit on something big, and a year ago Koch decided to make his lab entirely open.
That was easier said than done. The specific challenges vary from field to field, but putting data online means putting it in a form that other scientists -- and, increasingly, computers -- can use (see box). Astronomer Sarah Kendrew of the Leiden Observatory in the Netherlands points to the Virtual Observatory (VO) project -- a "super-repository" for astronomy data -- as a good example of that struggle. "Getting data VO compliant, if it wasn't planned in from the start, requires a lot of resources," she writes by e-mail. "I don't think smaller or older observatories think it's worth their while, even though they produce excellent data."
The challenge of opening up your data
Open data may sound idyllic, but there are some real, practical challenges to getting it done. For one thing, as most people who have ever published a blog realize, not everything posted on the Internet gets noticed and used.
Eisen puts it this way: "Just the technical details of releasing that data is not straightforward. Where do you put it? ... What format do you exactly put it in? How do you tell people that it has got a different data-release policy than the other data at that place?" Even when there is a prescribed format, such as for GenBank, he says that submitting the data "is not a trivial activity." And "that's just sequence data," he continues. "Imagine experimental data," which comes in infinite forms with an immeasurably wide range of experimental conditions.
And then there's the issue of licensing. Should you impose restrictions? "Anytime you have restrictions on use and reuse and rerelease of the data, it just becomes a complicated mess as to what you are allowed or not allowed to do," Eisen continues. So Eisen advocates completely open data. "If you release the data with no restrictions, it's very clear: anybody can do anything." The Panton Principles provide guidelines on how to liberate your data.
Doing it well might be challenging, but just getting it out there in the open, in any form, is a useful step, says Drexel's Bradley. "It's much, much easier to get automation involved in the scientific process if you make data open." The point is to get it out there, to put your data in play. Then "anyone in the world can come in, write a script, have some AI [artificial intelligence] interact with the data, and you never know how it's going to be used in a productive way."
True Open Notebook practitioners don't limit their open content to well-formatted data. Ideally, they want their Open Notebooks to include every last scribble their lab generates. Koch admires the integrated system set up by chemist Jean-Claude Bradley of Drexel University in Philadelphia, Pennsylvania, who coined the term "Open Notebook Science" and runs one of the most open labs in the world. Bradley uses wikis, blogging software, Google spreadsheets, and specialized software to make all of his lab notebooks and data available in real time. "It's sort of going away from a culture of trust to one of proof," Bradley says. "Everybody makes mistakes. And if you don't expose your raw data, nobody will find your mistakes."
And people do find mistakes, Eisen says. The first time he released genome-sequencing data with no restrictions, another scientist wrote within hours to tell him that there was a glitch in the database: Anthrax sequences had slipped in with those of the organism they were sequencing. "We wouldn't have found this for probably a year," Eisen says. In a sense, the decision to release the data with no restrictions transformed potential rivals into collaborators. The contributions from those outsiders far outweighed the risks of getting scooped, he says.
Of course, most scientists aren't eager to admit their errors. Carl Boettiger, a graduate student in the population biology program at UC Davis and an organizer of the UC Davis Open Science group, started an open notebook this year. "When you're the only one exposing your mistakes, that's a little bit scary," he says. But he thinks the pressure makes him a better scientist. "Because you do worry a little about it, your notebook is that much better."
Eisen points to a generational difference in the willingness to adopt an open approach. He finds that people under 40 are more used to making information about themselves public. "It's very different than it was, say, 10 years ago," he says. "A lot of young scientists, it doesn't faze them as much."
Risks and rewards
Koch says he isn't too worried about getting scooped, even though -- unlike most of his fellow Open Notebook Science practitioners -- he is not yet tenured. Open Notebook Science advocates claim that being open may protect a scientist's ideas rather than exposing them to theft. Newton's decision to conceal his findings within an anagram made it harder for him to prove priority over rival Gottfried Leibniz. Open Notebook scientists say all they need to do is point to their open notebooks to show that they had an idea or found a result first. "I've been able to cite my [online] lab notebook pages in a peer-reviewed paper," Bradley says. "That's clearly citing your priority." In the case of an unethical theft of ideas, "the published track record would make it easier to shame the person who did the scooping," Koch wrote in his blog.
Koch says that having an open lab may have even helped him get a grant. "The PI [principal investigator] describes an unusual and extremely attractive plan to run the proposed research project as open access," an anonymous reviewer wrote in response to one of Koch's proposals. The other reviewers agreed. Koch also thinks the attention he has gotten from being an advocate of Open Science will help him in his tenure bid.
Still, many researchers just don't think the concept of open science is widely applicable. Brian Krueger, a molecular biologist at the University of Florida, is no open-science skeptic: He runs a lab with an open slant and even started a social network -- LabSpaces.net -- that aims to make the scientific process more public and open. But Krueger is measured, at best, in his enthusiasm for open-notebook science. "I don't think it's practical to expect a competitive field like science to turn utopian and selfless for the greater good," he writes in an e-mail.
Bradley acknowledges that the downsides of openness make it impossible for some to adopt. Before switching his lab to an Open Notebook setup in 2005, he worked in nanotechnology; he now works on malaria. He could not have practiced Open Notebook Science then, he says, because it would have interfered with his ability to get patents. "It's not a very optimal strategy if you're interested in intellectual property," he says. It's also not good if you work with confidential information. Nonetheless, Bradley has been surprised by the variety of scientists who have adopted Open Notebook techniques. "There's even people who are in computation and theory who started to use it, and I didn't really have that in mind originally."
Another downside -- especially for scientists seeking tenure -- is that many traditional journals will not publish open research. Instead of publishing in the American Chemical Society journals, Bradley publishes in the open-access peer-reviewed journals Chemistry Central, the Journal of Cheminformatics, and the Journal of Visualized Experiments. That's fine for him, he says, but it may not work for a young professor filling out a tenure dossier.
Bradley suggests that early-career researchers start by making a side project open so that they can get the benefits of doing Open Notebook Science while still pursuing the traditional measures of success -- like placing papers in the most prestigious journals. But even a scaled-down approach can be difficult, he says, because collaborators and PIs are often resistant.
That's okay, he says. "It's not necessary for even a large fraction of scientists to take this approach to have a pretty significant impact, because it's a network effect. As soon as you add one more node into the network, it greatly amplifies what we're able to do."
One way Bradley is adding nodes is through the Open Notebook Science Challenge, which he started in 2008. Undergraduate and graduate students compete for $500 prizes by performing open-notebook solubility measurements. The result is a free, increasingly comprehensive list of solubility measurements available to chemists around the world. Aside from the money, participants get feedback from an international panel of judges. "It's a way to mentor students who would otherwise have no contact with people of this caliber," Bradley says.
Once people get started on Open Notebook Science, Koch says, it's hard to go back. His graduate students have embraced the practice, sometimes fervently. "When [Koch] first approached me about the lab policies, I didn't know what 'open' meant and I had assumed that every lab was like that," Salvagno writes by e-mail. "I just figured that science was open. When I found out the truth, I was excited to be a pioneer. I could never go back."
Open Science: An umbrella term sometimes used to unite the concepts of Open Notebook Science, Open Data, Open Access, and, in some uses, Open Source.
Open Notebook Science: In its purest form, the practice of putting the entire record of a research project -- lab notebooks, raw data, drafts of proposals and papers -- online in real time. Moderated versions of open-notebook science may involve some time delays or restrictions on releasing some types of materials, such as proposal drafts.
Open Data: The practice of making data freely available, without restrictions such as copyright.
Open Access: In scholarly publishing (including peer-reviewed scientific journals), the practice of making all journal articles available free.
Open Source: Probably the best-known "open" term, the practice of making computer source code freely available to use and modify. In a model emulated by open-science practitioners, open-source development has facilitated huge collaborative projects, yielding products such as the operating system Linux.