Science 24 March 2000:
Vol. 287. no. 5461, pp. 2196 - 2204
DOI: 10.1126/science.287.5461.2196

Review

A Whole-Genome Assembly of Drosophila

Eugene W. Myers,1* Granger G. Sutton,1 Art L. Delcher,1 Ian M. Dew,1 Dan P. Fasulo,1 Michael J. Flanigan,1 Saul A. Kravitz,1 Clark M. Mobarry,1 Knut H. J. Reinert,1 Karin A. Remington,1 Eric L. Anson,1 Randall A. Bolanos,1 Hui-Hsien Chou,1 Catherine M. Jordan,1 Aaron L. Halpern,1 Stefano Lonardi,1 Ellen M. Beasley,1 Rhonda C. Brandon,1 Lin Chen,1 Patrick J. Dunn,1 Zhongwu Lai,1 Yong Liang,1 Deborah R. Nusskern,1 Ming Zhan,1 Qing Zhang,1 Xiangqun Zheng,1 Gerald M. Rubin,2 Mark D. Adams,1 J. Craig Venter1

We report on the quality of a whole-genome assembly of Drosophila melanogaster and the nature of the computer algorithms that accomplished it. Three independent external data sources essentially agree with and support the assembly's sequence and ordering of contigs across the euchromatic portion of the genome. In addition, there are isolated contigs that we believe represent nonrepetitive pockets within the heterochromatin of the centromeres. Comparison with a previously sequenced 2.9-megabase region indicates that sequencing accuracy within nonrepetitive segments is greater than 99.99% without manual curation. As such, this initial reconstruction of the Drosophila sequence should be of substantial value to the scientific community.

1 Celera Genomics, Inc., 45 West Gude Drive, Rockville, MD 20850, USA.
2 Howard Hughes Medical Institute, Berkeley Drosophila Genome Project, University of California, Berkeley, CA 94720, USA.
* To whom correspondence should be sent: Gene.Myers{at}celera.com



