Originally published in Science Express on 27 September 2007
Science 19 October 2007:
Vol. 318. no. 5849, pp. 420 - 426
DOI: 10.1126/science.1149504

Research Articles

Paired-End Mapping Reveals Extensive Structural Variation in the Human Genome

Jan O. Korbel,1,2* Alexander Eckehart Urban,3* Jason P. Affourtit,4* Brian Godwin,4 Fabian Grubert,5 Jan Fredrik Simons,4 Philip M. Kim,1 Dean Palejev,5 Nicholas J. Carriero,6 Lei Du,4 Bruce E. Taillon,4 Zhoutao Chen,4 Andrea Tanzer,7,8,9 A. C. Eugenia Saunders,3 Jianxiang Chi,10 Fengtang Yang,10 Nigel P. Carter,10 Matthew E. Hurles,10 Sherman M. Weissman,5 Timothy T. Harkins,11 Mark B. Gerstein,1,6,12 Michael Egholm,4† Michael Snyder1,3†

Structural variation of the genome involves kilobase- to megabase-sized deletions, duplications, insertions, inversions, and complex combinations of rearrangements. We introduce high-throughput and massive paired-end mapping (PEM), a large-scale genome-sequencing method to identify structural variants (SVs) ~3 kilobases (kb) or larger that combines the rescue and capture of paired ends of 3-kb fragments, massive 454 sequencing, and a computational approach to map DNA reads onto a reference genome. PEM was used to map SVs in an African and in a putatively European individual and identified shared and divergent SVs relative to the reference genome. Overall, we fine-mapped more than 1000 SVs and documented that the number of SVs among humans is much larger than initially hypothesized; many of the SVs potentially affect gene function. The breakpoint junction sequences of more than 200 SVs were determined with a novel pooling strategy and computational analysis. Our analysis provided insights into the mechanisms of SV formation in humans.
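
As an illustration only, the general idea behind paired-end mapping can be sketched as follows: when the two ends of a ~3-kb fragment are mapped onto the reference genome, a mapped span much longer than expected suggests a deletion in the sampled genome, a much shorter span suggests an insertion, and an unexpected relative orientation of the two ends suggests an inversion. The function name, the tolerance value, and the single-pair decision rule below are assumptions made for this sketch; they are not taken from the article, whose actual cutoffs and scoring are described in the full text.

```python
# Minimal sketch of span-based classification of one mapped read pair.
# EXPECTED_SPAN reflects the ~3-kb fragments mentioned in the abstract;
# TOLERANCE is a hypothetical cutoff chosen only for illustration.
EXPECTED_SPAN = 3000   # expected distance between paired ends, in bp
TOLERANCE = 1500       # assumed allowed deviation, in bp


def classify_pair(span, orientation_ok=True,
                  expected=EXPECTED_SPAN, tolerance=TOLERANCE):
    """Label a read pair by how its mapped span compares with expectation.

    span           -- distance between the two ends on the reference, in bp
    orientation_ok -- True if the ends map in the expected relative orientation
    """
    if not orientation_ok:
        # Ends that map with unexpected orientation hint at an inversion.
        return "inversion candidate"
    if span > expected + tolerance:
        # Ends farther apart on the reference: sequence missing in the sample.
        return "deletion candidate"
    if span < expected - tolerance:
        # Ends closer together on the reference: extra sequence in the sample.
        return "insertion candidate"
    return "concordant"


if __name__ == "__main__":
    for span, ok in [(3100, True), (9000, True), (500, True), (3000, False)]:
        print(span, ok, classify_pair(span, ok))
```

In practice a single discordant pair would not be treated as a variant call; a realistic pipeline would presumably require several independent pairs supporting the same event before reporting it.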

1 Molecular Biophysics and Biochemistry Department, Yale University, New Haven, CT 06520, USA.
2 European Molecular Biology Laboratory, 69117 Heidelberg, Germany.
3 Department of Molecular, Cellular, and Developmental Biology, Yale University, New Haven, CT 06520, USA.
4 454 Life Sciences, A Roche Company, Branford, CT 06405, USA.
5 Department of Genetics, Yale University School of Medicine, New Haven, CT 06520, USA.
6 Department of Computer Science, Yale University, New Haven, CT 06520, USA.
7 Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT 06520, USA.
8 Department of Computer Science, University of Leipzig, 04107 Leipzig, Germany.
9 Institute for Theoretical Chemistry, University of Vienna, 1090 Vienna, Austria.
10 The Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK.
11 Roche Applied Science, Indianapolis, IN 46250, USA.
12 Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA.

* These authors contributed equally to this work.

† To whom correspondence should be addressed. E-mail: megholm@454.com (M.E.); michael.snyder@yale.edu (M.S.)





Science. ISSN 0036-8075 (print), 1095-9203 (online)