Mobile DNA in Old World Monkeys: A Glimpse Through the Rhesus Macaque Genome
Kyudong Han,1* Miriam K. Konkel,1* Jinchuan Xing,1* Hui Wang,1* Jungnam Lee,1 Thomas J. Meyer,1 Charles T. Huang,1 Erin Sandifer,1 Kristi Hebert,1 Erin W. Barnes,1 Robert Hubley,2 Webb Miller,3 Arian F. A. Smit,2 Brygg Ullmer,4 Mark A. Batzer1
The completion of the draft sequence of the rhesus macaque genome allowed us to study the genomic composition and evolution of transposable elements in this representative of the Old World monkey lineage, a group of diverse primates closely related to humans. The L1 family of long interspersed elements appears to have evolved as a single lineage, and Alu elements have evolved into four currently active lineages. We also found evidence of elevated horizontal transmissions of retroviruses and the absence of DNA transposon activity in the Old World monkey lineage. In addition, 100 precursors of composite SVA (short interspersed element, variable number of tandem repeat, and Alu) elements were identified, with the majority being shared by the common ancestor of humans and rhesus macaques. Mobile elements compose roughly 50% of primate genomes, and our findings illustrate their diversity and strong influence on genome evolution between closely related species.
1 Department of Biological Sciences, Biological Computation and Visualization Center, Center for Bio-Modular Multi-Scale Systems, Louisiana State University, Baton Rouge, LA 70803, USA. 2 Institute for Systems Biology, Seattle, WA 98103, USA. 3 Center for Comparative Genomics and Bioinformatics, Pennsylvania State University, University Park, PA 16802, USA. 4 Department of Computer Science, Center for Computation and Technology (CCT), Louisiana State University, Baton Rouge, LA 70803, USA.
* These authors contributed equally to this work.
Present address: Department of Human Genetics, University of Utah Health Sciences Center, Salt Lake City, UT 84112, USA.
To whom correspondence should be addressed. E-mail: mbatzer{at}lsu.edu
Old World monkeys (OWMs) represent one of the primate groups most closely related to humans. Rhesus macaques (Macaca mulatta), along with other OWMs, have been extensively used in biomedical studies (1). An improved understanding of their genomic architecture could hold important implications for medicine, evolutionary understanding, and beyond. Similar to the human and chimpanzee genomes, roughly 50% of the rhesus macaque genome consists of various repetitive sequences (2-4). The majority of these repeats are mobile elements, which can be divided into class II DNA transposons (5) and class I retrotransposons (6). Related transposable elements are further categorized into families, with each family further classified into subfamilies on the basis of their sequence relationships. The insertion of mobile elements can alter gene expression (7), generate genomic deletions (8), and even create new genes and gene families (9). Existing repetitive elements can also mediate recombination between similar elements at different genomic locations (ectopic recombination) (10). In addition, the GC-rich nature of certain mobile elements {e.g., Alu and SVA [short interspersed element (SINE), variable number of tandem repeat (VNTR), and Alu] elements} can introduce new GC islands through their insertion (3). Despite the overall similarity in retrotransposon mobilization activity in the OWM and hominoid (human and ape) lineages, mobile elements have continued to evolve independently in both lineages. Close examination of the overall mobile-element composition in OWMs, with the rhesus macaque genome used as a reference, allows an understanding of their lineage-specific expansion and illustrates their overall contribution to genome evolution.
Without any detected lineage-specific copies, DNA transposons, which mobilize through a cut-and-paste mechanism, appear to have been inactive in the rhesus macaque lineage since its divergence from the human lineage. The paucity of DNA transposon mobilization in mammals, and in amniotes in general, is noteworthy by comparison with other organisms (e.g., plants) and may result from the relative difficulty of horizontal transfer into animals' germ lines (11).
Similar to the human genome, the rhesus macaque genome contains over half a million recognizable copies of endogenous retroviruses (ERVs) and their nonautonomous derivatives, with the great majority being present or fixed before the hominoid-OWM split (12). We found evidence for at least eight instances of horizontal transmission of ERVs in the OWM lineage, resulting in 2750 extant copies (table S1 and SOM Text). This is much higher than in the human lineage, where there is evidence for only one or two invading elements leaving fewer than 10 extant copies (13). Five of the eight horizontally transmitted ERVs belong to class I retroviruses, and the remaining three belong to class II retroviruses (shown in red letters in Fig. 1). Apart from these new invasions, at least seven ERV families had already entered the genome before the hominoid-OWM split and remained active afterward. There are over 3500 copies of these ERV subfamilies in the OWM lineage, similar to the number of lineage-specific ERV copies in humans.
Fig. 1. Phylogenetic tree of retroviruses based on full-length Pol proteins. Common infectious retroviruses and endogenous retroviruses, present in fish, birds, mammals (nonprimate), and primates, were included in the analysis. Color identifications for each group are shown in the upper right corner. Asterisks and circles show deep-rooted branches with >95 and >75% bootstrap values, respectively. The ERVs identified in this study that invaded the OWM genome horizontally (i.e., through external germline infection) are indicated with red letters. For all ERVs shown in blue letters, the original insertion occurred in the common ancestor of humans and rhesus macaques (i.e., vertically) and is present in both genomes. All ERVs indicated with blue letters also generated new insertions in the OWM lineage. The scale bar indicates 10% divergence in the amino acid sequence.
The L1PA (primate A) family of long interspersed elements (LINEs) represents the dominant active L1 lineage throughout primate evolution. In our analysis, L1PA5 was the most commonly recovered L1 subfamily, and 19,000 L1PA5 elements specific to the OWM lineage were identified in the rhesus macaque genome. Most of these elements represent insertion events that occurred along the OWM lineage leading to rhesus macaques and are therefore present in multiple OWM species (fig. S2). A total of 32 OWM-specific L1 subfamilies were identified with the use of diagnostic substitutions present in these elements (table S2). To investigate the relationship of L1s, we constructed a median-joining network with their consensus sequences (Fig. 2 and SOM Text) and estimated the age of each subfamily (table S2). The network results indicated that the OWM-specific L1 lineage rooted with the L1PA6 consensus sequence, and several lineages roughly followed a sequential order, with little overlap in their amplification period. The sequential evolution of L1 elements appears to follow a general trend seen in mammalian L1s (14) and may result from amplification competition between two distinct L1 lineages (15). Altogether, we identified nine putative retrotransposition-competent L1s in the rhesus macaque genome, and they belonged to the L1CER-3 or L1CER-4 subfamilies; each L1 subfamily name is identified by "CER" (which stands for Cercopithecidae, indicating the origin of the consensus sequence) and an Arabic numeral indicating its lineage (12). Nine is considerably fewer potentially active L1 elements than in the human genome, which has 80 to 100 active copies (16). Nevertheless, it is likely that additional retrotransposition-competent L1 elements will be recovered in more refined drafts of the rhesus macaque genome.
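The idea of sorting elements into subfamilies by diagnostic substitutions can be illustrated with a minimal sketch. It assumes each element has already been aligned to a common consensus, and it uses made-up subfamily names, positions, and bases (`DIAGNOSTIC_SITES`, `assign_subfamily`, and `min_fraction` are hypothetical and are not taken from table S2 or from the authors' pipeline).

```python
from typing import Dict, Optional

# Hypothetical diagnostic sites: 0-based positions in the consensus alignment
# mapped to the base expected for that subfamily. Illustrative values only.
DIAGNOSTIC_SITES: Dict[str, Dict[int, str]] = {
    "subfamily_A": {3: "G", 7: "T"},
    "subfamily_B": {3: "G", 7: "C", 10: "A"},
}


def assign_subfamily(
    aligned_seq: str,
    sites: Dict[str, Dict[int, str]] = DIAGNOSTIC_SITES,
    min_fraction: float = 1.0,
) -> Optional[str]:
    """Return the subfamily whose diagnostic bases best match the sequence.

    A subfamily qualifies only if at least `min_fraction` of its diagnostic
    sites match; ties are broken in favor of the subfamily with more matching
    sites. Returns None when no subfamily qualifies.
    """
    best_name: Optional[str] = None
    best_key = (0.0, 0)
    for name, positions in sites.items():
        matches = sum(
            1
            for pos, base in positions.items()
            if pos < len(aligned_seq) and aligned_seq[pos] == base
        )
        fraction = matches / len(positions)
        key = (fraction, matches)
        if fraction >= min_fraction and key > best_key:
            best_name, best_key = name, key
    return best_name


if __name__ == "__main__":
    print(assign_subfamily("AAAGAAACAAAA"))
```

In practice, lowering `min_fraction` would tolerate post-insertion mutations in older elements, at the cost of more ambiguous assignments.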
Fig. 2. Median-joining network of OWM-specific L1 subfamilies. Subfamilies are represented by circles, with the circle size symbolizing the relative size of each subfamily. The length of the lines corresponds to the number of substitutions. The scale of a single substitution is shown in the upper left corner. Broken lines indicate segments not drawn to scale. Gray circles represent the subfamilies belonging to the L1CER-3 lineage, which include an 18-base pair (bp) duplication in their 3' untranslated region (3'UTR), and green-edged circles contain intact full-length L1 elements. The dashed line and red arrow represent two alternative pathways for the origin of the L1CER-4 subfamily. The subfamilies in the blue and pink ovals share the same diagnostic mutations but do not share the 18-bp duplication. My, million years.
Retrotransposon-mediated DNA sequence transduction is a process whereby a retrotransposon carries a flanking genomic sequence during its mobilization that can result in exon or gene duplication (17). Three L1 elements with 5'-transduced exon-derived sequences were identified in the rhesus macaque genome. Moreover, detailed analysis indicated that one of the three insertions occurred in an exon of another gene (table S3 and SOM Text). These three events empirically demonstrate that exon-derived sequences can be transferred via 5' L1-mediated transduction within primate genomes and that 5' transduction constitutes a second mechanism of retrotransposon-mediated "exon shuffling."
Alu elements are the most successful SINEs in primate genomes (18), and 110,000 Alu insertions are specific to OWMs. Fourteen different OWM lineage-specific AluY subfamilies fell into four lineages, shown in a median-joining network analysis (Fig. 3), and were identified with estimated copy numbers (table S4). All subfamilies were estimated to have originated after the hominoid-OWM divergence and were congruent with our phylogenetic analyses showing that all of these Alu subfamilies were restricted to OWMs (SOM Text). The simultaneous retrotransposition activity of multiple Alu subfamilies is similar to that in the human genome, and the activity of multiple "source genes" may have contributed to the amplification success of Alu elements despite their reliance on L1 enzymatic machinery for mobilization (19).
Fig. 3. Median-joining network of OWM-specific Alu subfamilies. Subfamilies are represented by circles. The length of the lines corresponds to the number of substitutions, and the scale of a single substitution is shown in the upper left corner. Broken lines indicate segments not drawn to scale. Gray circles represent all subfamilies belonging to the AluYRb lineage containing a 12-bp deletion. Red-edged circles denote the youngest Alu subfamily within each lineage, and the blue-edged circle indicates the AluY subfamily consensus sequence.
About 100 precursors of SVA were identified in the rhesus macaque genome. The variable number of tandem repeat (VNTR) regions of these elements share >90% identity with the VNTR unit in hominoid SVA elements (20), although they have no sequence homology with other components of SVA elements. Thus, these elements appear to have contributed a portion of the genetic material required to form the SVA composite retrotransposon family in hominoids. The majority of these elements are shared between human and rhesus macaque, indicating that these elements were active before the divergence of hominoids and OWMs. The low number of lineage-specific elements (20 in the OWM lineage) suggests a very low retrotransposition rate of SVA precursor elements over the past 25 million years.
Composing nearly half of all sequenced primate genomes, mobile elements, especially retrotransposons, are major components of genomic variation and a driving force of primate evolution. Although the overall number of mobile elements is similar in the human, chimpanzee, and rhesus macaque genomes (2-4), a large fraction of the elements inserted independently into different locations within each genome and thus shaped the genomes differently (21). Whereas most retrotransposon insertions remain neutral in the genome, many insertions can have deleterious effects of varying severity. Mobile elements can cause genetic diseases not only by direct gene disruption or by the deletion of exonic sequence upon insertion but also by mediating subsequent recombination between existing retrotransposons. Indeed, more than 118 human genetic disorders are caused by retrotransposons, including hemophilia B, breast cancers, and congenital muscular dystrophy [see (22) and (23) for reviews]; they are likely to have a similar impact on the rhesus genome. Yet, retrotransposons are also responsible for creating a variety of genomic novelties. They are involved in mediating gene duplication, exon shuffling, and RNA-editing-mediated exonization (9, 17, 24). All these mechanisms can contribute to new gene formation, as well as potentially altering DNA methylation patterns and contributing to X chromosome inactivation in females (25, 26). In addition, retrotransposons provide highly valuable genetic systems for primate population and phylogenetic studies, because they have a known ancestral (i.e., insertion-absent) state, and the chance that the same type of element would integrate at precisely the same location in multiple individuals is essentially zero (i.e., the insertions are identical by descent) (27, 28). Altogether, understanding the mobile-element landscape in primates is not only important for biologists but also crucial for biomedical researchers using primate animal models.
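Because the ancestral state of a retrotransposon locus is insertion-absent and independent insertions at the same site are essentially ruled out, a shared insertion can be read as a shared derived character. The following minimal sketch counts shared insertions between pairs of species; the loci, species labels, and the function `shared_insertion_counts` are hypothetical placeholders, not data or methods from this study.

```python
from itertools import combinations
from typing import Dict, Set, Tuple

# Hypothetical genotypes: locus name -> set of species carrying the insertion.
# Species and loci are illustrative placeholders only.
INSERTIONS: Dict[str, Set[str]] = {
    "locus_1": {"rhesus", "baboon"},
    "locus_2": {"rhesus", "baboon", "green_monkey"},
    "locus_3": {"human", "chimpanzee"},
    "locus_4": {"rhesus"},
}


def shared_insertion_counts(
    insertions: Dict[str, Set[str]],
) -> Dict[Tuple[str, str], int]:
    """Count, for each species pair, the loci where both carry the insertion.

    Pairs are stored as alphabetically sorted tuples. Under the
    identical-by-descent assumption, a higher count is consistent with a
    more recent common ancestor.
    """
    counts: Dict[Tuple[str, str], int] = {}
    for carriers in insertions.values():
        for pair in combinations(sorted(carriers), 2):
            counts[pair] = counts.get(pair, 0) + 1
    return counts


if __name__ == "__main__":
    for pair, n in sorted(shared_insertion_counts(INSERTIONS).items()):
        print(pair, n)
```

Loci carried by a single species (such as `locus_4` here) contribute no pairs; they are lineage-specific insertions and carry no grouping information.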
27. A. M. Shedlock, K. Takahashi, N. Okada, Trends Ecol. Evol. 19, 545 (2004).
28. D. A. Ray, J. Xing, A. H. Salem, M. A. Batzer, Syst. Biol. 55, 928 (2006).
29. Thanks to the RMGSAC for the rhesus macaque genome sequence and to S. Brandt, W. Scullin, and S. White for computational support. This project was facilitated in part by high-performance computing allocations from Louisiana State University CCT and supported by the NSF grants BCS-0218338 (M.A.B.) and EPS-0346411 (M.A.B.), NIH GM59290 (M.A.B.), and the State of Louisiana Board of Regents Support Fund (M.A.B.).
Received for publication 3 January 2007. Accepted for publication 16 March 2007.