Megascience: 'Omics Data Sharing
Dawn Field,1,*
Susanna-Assunta Sansone,1,2
Amanda Collis,3
Tim Booth,1
Peter Dukes,4
Susan K. Gregurick,5
Karen Kennedy,6
Patrik Kolar,7
Eugene Kolker,8
Mary Maxon,9
Siân Millard,10
Alexis-Michel Mugabushaka,11
Nicola Perrin,12
Jacques E. Remacle,7
Karin Remington,13
Philippe Rocca-Serra,12
Chris F. Taylor,12
Mark Thorley,14
Bela Tiwari,1
John Wilbanks,15
Development of high-throughput genomic and postgenomic technologies has caused a change in approaches to data handling and processing (1). One biological sample might be used to generate many kinds of "big" data in parallel, such as genome sequence (genomics), patterns of gene and protein expression (transcriptomics and proteomics), and metabolite concentrations and fluxes (metabolomics). Extensive computer manipulations are required for even basic analyses of such data; the challenges mount further when two or more studies' outputs must be compared or integrated.
1 U.K. Natural Environment Research Council (NERC), Environmental Bioinformatics Centre.
2 European Molecular Biology Laboratory (EMBL) Outstation, The European Bioinformatics Institute (EBI).
3 U.K. Biotechnology and Biological Sciences Research Council.
4 U.K. Medical Research Council.
5 U.S. Department of Energy.
6 Genome Canada and Wellcome Trust Sanger Institute.
7 Unit for Genomics and Systems Biology, European Commission.
8 Seattle Children's Hospital.
9 Marine Microbiology Initiative, Gordon and Betty Moore Foundation.
10 U.K. Economic and Social Research Council.
11 European Science Foundation.
12 The Wellcome Trust.
13 U.S. National Institute of General Medical Sciences, NIH.
14 NERC.
15 Science Commons.
* Full author affiliations are available on Science Online.
These authors contributed equally to this article.
Author for correspondence. E-mail: dfield@ceh.ac.uk