Science 13 April 2007:
Vol. 316. no. 5822, pp. 240 - 243
DOI: 10.1126/science.1140462

Reports

Demographic Histories and Patterns of Linkage Disequilibrium in Chinese and Indian Rhesus Macaques

Ryan D. Hernandez,1 Melissa J. Hubisz,2 David A. Wheeler,3 David G. Smith,4,5 Betsy Ferguson,6,7 Jeffrey Rogers,8 Lynne Nazareth,3 Amit Indap,1 Traci Bourquin,3 John McPherson,3 Donna Muzny,3 Richard Gibbs,3 Rasmus Nielsen,9 Carlos D. Bustamante1*

To understand the demographic history of rhesus macaques (Macaca mulatta) and document the extent of linkage disequilibrium (LD) in the genome, we partially resequenced five Encyclopedia of DNA Elements regions in 9 Chinese and 38 captive-born Indian rhesus macaques. Population genetic analyses of the 1476 single-nucleotide polymorphisms discovered suggest that the two populations separated about 162,000 years ago, with the Chinese population tripling in size since then and the Indian population eventually shrinking by a factor of four. Using coalescent simulations, we confirmed that these inferred demographic events explain a much faster decay of LD in Chinese (r² ≈ 0.15 at 10 kilobases) versus Indian (r² ≈ 0.52 at 10 kilobases) macaque populations.

1 Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14850, USA.
2 Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA.
3 Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA.
4 Department of Anthropology, Davis, CA, USA.
5 California National Primate Research Center, Davis, CA, USA.
6 Genetics Research and Informatics Program, Oregon National Primate Research Center, Oregon Health and Sciences University, Beaverton, OR 97006, USA.
7 Washington National Primate Research Center, University of Washington, Seattle, WA 98195, USA.
8 Department of Genetics, Southwest Foundation for Biomedical Research, and Southwest National Primate Research Center, San Antonio, TX 78227, USA.
9 Center for Comparative Genomics, Department of Biology, University of Copenhagen, Universitetsparken 15, 2100 Kbh Ø, Denmark.

* To whom correspondence should be addressed. E-mail: cdb28@cornell.edu

Rhesus macaques (Macaca mulatta) and humans shared a most recent common ancestor (MRCA) ~25 million years ago (Ma), and our genomes differ at <7% of nucleotide bases (1). Rhesus and humans, therefore, share a large number of fundamental biological characteristics, including many underlying genetic and physiological processes that lead to disease. For this reason, rhesus macaques have become a model organism for vaccine research (2, 3), as well as studies of normal human physiology and disease. Although previous studies of genetic variation in rhesus have described >300 microsatellite polymorphisms (4, 5), identifying specific genetic risk factors for disease requires a much greater resolution of genetic variation across the genome.

The current geographic range of rhesus macaques is larger than that of any other nonhuman primate, stretching from western India and Pakistan to the eastern shores of China (Fig. 1). Fossil records suggest that the genus Macaca originated in northern Africa approximately 5.5 Ma, followed by migration through the Middle East and into northern India by ~3 Ma (6). By ~2 Ma, macaques had traversed most of China and reached the Indonesian archipelago, where the putative ancestral species of rhesus macaque, M. fascicularis, is thought to have originated (6, 7).


Fig. 1. The current geographic range of rhesus macaques [green, redrawn from (20)] with the inferred demographic history and the sample locations superimposed. The geographic location of the MRCA is based on (4).
 

Previous studies of mitochondrial DNA (8), major histocompatibility complex (MHC) alleles (9), and single-nucleotide polymorphisms (SNPs) in gene-linked regions (10) suggest moderate levels of genetic differentiation between captive-born Indian and Chinese rhesus populations. Developing a more thorough understanding of genetic variation within and between these two populations has important implications for biomedical research. For example, when infected with the simian immunodeficiency virus, animals from Chinese populations develop AIDS-like symptoms more slowly than animals from Indian populations (3).

We have identified 1476 SNPs by sequencing >150 kb of DNA across five Encyclopedia of DNA Elements (ENCODE) (11–13) regions located on separate autosomal chromosomes in nine Chinese rhesus macaques captive-born from wild-caught parents and 38 captive-born Indian rhesus macaques. The Chinese animals derive from three distinct geographical sites, whereas the Indian animals came from three different colonies in the United States (Fig. 1). Individuals were chosen to represent rhesus macaque populations that are currently being studied by the international community and to minimize relatedness in the sample [with most individuals in the study being unrelated back to the founding of the colony into which they were born, and none having a shared grandparent (13)]. In our sample of 1476 SNPs discovered, only 486 (33%) were shared across both populations, whereas 604 were found only in the Chinese population (61% of 1090 SNPs observed) and 386 were found only in the Indian population (39% of 872 SNPs observed). The frequency distribution of derived mutations across SNPs [using DNA sequence from the ENCODE project for baboon, Papio cynocephalus anubis, to infer the putative ancestral state (13)] shows that the Chinese population harbors an excess of rare SNPs relative to a population of constant size, whereas the Indian population has too few rare and too many intermediate- and high-frequency–derived SNPs (Fig. 2A). The observed disparity in SNP density (7.25 SNPs per kb for Chinese versus 5.8 SNPs per kb for Indian) in the two populations suggests that the effective size of the Chinese population is much larger than that of the Indian population, given that the Indian sample size is four times as large as that of the Chinese.
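The SNP tallies in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the three counts are taken from the text, but the sequenced length is assumed to be exactly 150 kb (the text says only ">150 kb"), so the densities it prints are approximate, and all variable names are hypothetical.

```python
# Counts reported in the text.
shared = 486        # SNPs segregating in both populations
chinese_only = 604  # SNPs found only in the Chinese sample
indian_only = 386   # SNPs found only in the Indian sample

# Assumption: exactly 150 kb sequenced (the text says ">150 kb").
sequenced_kb = 150.0

total = shared + chinese_only + indian_only
chinese_total = shared + chinese_only  # all SNPs observed in Chinese animals
indian_total = shared + indian_only    # all SNPs observed in Indian animals


def snps_per_kb(n_snps, kb=sequenced_kb):
    """SNP density under the assumed sequenced length."""
    return n_snps / kb


print("total SNPs:", total)
print("observed in Chinese / Indian:", chinese_total, indian_total)
print("fraction shared:", round(shared / total, 3))
print("approx. SNPs per kb (Chinese):", round(snps_per_kb(chinese_total), 2))
print("approx. SNPs per kb (Indian):", round(snps_per_kb(indian_total), 2))
```

The per-population totals are simply the shared count plus each population's private count, which is why they sum to more than the overall total.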


Fig. 2. (A) The marginal frequency spectrum of derived mutations for each population (shown as expected proportions in a subsample of 10 chromosomes by integrating over possible configurations of observed and missing data, with the total number of SNPs in parentheses) and the expected distribution under the standard neutral model (SNM) of constant size. (B) A "topographical map" of the joint site-frequency spectrum for the two populations, with darker tones representing frequency pairs with few SNPs, and lighter tones representing frequency pairs with many SNPs.
 

We observed a moderate level of population structure between the Indian and Chinese samples, as measured by Wright's FST statistic (average FST = 0.14; SD = 0.11; range = –0.024 to 0.645) (Fig. 3A). Furthermore, the Bayesian clustering program STRUCTURE (14) clearly separates Chinese and Indian individuals when assuming two clusters (Fig. 3B), and considering more clusters does not significantly improve the fit of the model. We found only one Chinese individual with a marginal amount of Indian ancestry (8.5%, sampled from Suzhou) and eight Indian individuals with more than 5% Chinese ancestry [max 16.8%, including animals from all three primate centers (13)]. These low levels of admixture suggest that recurrent migration between the populations has been minimal. Moreover, the two populations were clearly distinguished by principal components analysis (15) along the first two axes of variation (Fig. 3C). Interestingly, the second component also separates one Chinese individual (sampled from Suzhou) from the others, which suggests that further population substructure may exist. Although this individual is not differentiated from other Chinese-origin animals in the STRUCTURE analysis, it may, nonetheless, harbor alleles from an unsampled Chinese subpopulation (i.e., the two wild-caught parents may be from different subpopulations).
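One common way to estimate FST from average pairwise differences is sketched below. It is an assumption that this form resembles the calculation used here; the exact estimator is described in the supporting material (13) and is not reproduced. The function names and toy haplotypes are hypothetical. The sketch compares the mean within-population pairwise difference with the mean between-population pairwise difference in one window.

```python
from itertools import combinations, product


def pairwise_diff(a, b):
    """Number of sites at which two equal-length haplotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def mean_within(haps):
    """Average pairwise difference among haplotypes of one population."""
    pairs = list(combinations(haps, 2))
    return sum(pairwise_diff(a, b) for a, b in pairs) / len(pairs)


def mean_between(haps1, haps2):
    """Average pairwise difference between haplotypes of two populations."""
    pairs = list(product(haps1, haps2))
    return sum(pairwise_diff(a, b) for a, b in pairs) / len(pairs)


def fst_window(pop1, pop2):
    """FST for one window as 1 - pi_within / pi_between.

    Assumes the two populations differ at one or more sites, so that
    pi_between is nonzero.
    """
    pi_within = (mean_within(pop1) + mean_within(pop2)) / 2
    pi_between = mean_between(pop1, pop2)
    return 1 - pi_within / pi_between


# Made-up haplotypes ('0' = ancestral allele, '1' = derived allele).
pop_a = ["0000", "0001", "0000"]
pop_b = ["1110", "1111", "1110"]
print(fst_window(pop_a, pop_b))
```

Estimators of this kind can return slightly negative values when within-population diversity exceeds between-population diversity in a window, which is compatible with a reported range that extends below zero.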


Fig. 3. (A) The distribution of FST between Indian and Chinese rhesus, calculated with the average pairwise-difference across each nonoverlapping window (13). (B) STRUCTURE results. Individuals are represented by vertical lines, and sorted by their amount of Chinese ancestry (black vertical line separates animals with Indian and Chinese origins). Colors correspond to the proportion of an individual's ancestry attributable to a given population (blue, Indian; red, Chinese). (C) Principal component 1 (PC1) and PC2 separate Indian from Chinese individuals. PC2 also isolates a single Chinese individual [corresponding to an individual sampled from Suzhou and shown as the fourth individual from the right in (B)].
 

Using maximum likelihood under the assumption that the animals in this study form a random sample from their respective population (13), we fit a two-population demographic model to the joint distribution of SNP frequencies, or site-frequency spectrum, shown in Fig. 2B. Our model suggests that the Chinese population expanded by a factor of 3.3 and separated from the Indian population ~162 thousand years ago (ka) (95% confidence interval, CI = 183 to 132 ka). After separating, the Indian population maintained its ancestral population size until ~51 ka (CI = 72 to 21 ka), when it was reduced by a factor of 4.3. The population genetic model, although a very simplistic approximation to the rich and complex history of the species, fits the data well, as indicated by a goodness-of-fit test (P = 0.133). Coalescent simulations (13) on the basis of the inferred demographic history for Indian and Chinese rhesus macaques suggest that the MRCA of the two populations lived ~1.94 Ma (SE 14 Ky). This estimate places the MRCA of rhesus near the divergence time from M. fascicularis, inferred from mitochondrial DNA to be 1.83 to 5 Ma (16, 17). Moreover, our simulations suggest that the effective size of the ancestral population of rhesus macaques was ~73,070 (SE 231) individuals, implying that the current effective size of the Chinese population is ~239,704, whereas the Indian population is estimated to be ~17,014.
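The relation between the ancestral effective size and the two current sizes can be illustrated by applying the rounded size-change factors quoted above. This is a rough consistency check, not the inference procedure: the factors 3.3 and 4.3 are rounded in the text, so the products only approximate the reported values, and the variable names are hypothetical.

```python
# Values quoted in the text.
ancestral_ne = 73_070         # effective size of the ancestral population
chinese_expansion = 3.3       # rounded expansion factor (Chinese)
indian_reduction = 4.3        # rounded reduction factor (Indian)
reported_chinese_ne = 239_704
reported_indian_ne = 17_014

# Sizes implied by the rounded factors.
implied_chinese_ne = ancestral_ne * chinese_expansion
implied_indian_ne = ancestral_ne / indian_reduction


def relative_gap(implied, reported):
    """Relative difference between an implied and a reported value."""
    return abs(implied - reported) / reported


print("Chinese: implied", round(implied_chinese_ne), "reported", reported_chinese_ne)
print("Indian:  implied", round(implied_indian_ne), "reported", reported_indian_ne)
print("relative gaps:",
      round(relative_gap(implied_chinese_ne, reported_chinese_ne), 4),
      round(relative_gap(implied_indian_ne, reported_indian_ne), 4))
```

Both implied sizes fall within about 1% of the reported estimates; the small gaps are what one would expect if the published factors were rounded to one decimal place.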

The recent demographic events that caused these differences in effective population sizes of Indian and Chinese rhesus macaques have also had a large impact on linkage disequilibrium (LD). To quantify the extent of LD in Indian and Chinese rhesus macaques, we measured the correlation coefficient (r²) of alleles from frequency-matched SNPs (13, 18). Figure 4 shows substantial differences between the Indian and Chinese rhesus macaque populations, which are more extreme than the patterns observed among humans. For example, within the Indian rhesus population, LD extends much further than LD observed for European humans, whereas the Chinese rhesus population shows little LD, even for SNPs that are physically very close. Coalescent simulations (13) show that the observed patterns of LD are consistent with our inferred demographic history of this species (shown in Fig. 4 as light blue and pink curves for Indian and Chinese rhesus, respectively). However, LD in the Indian population extends slightly further than expected. This observation may be consistent with recent admixture with a Burmese rhesus population not sampled in this study (8), because admixture between populations with allele frequency differences is known to generate long-range LD.
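For readers unfamiliar with the LD statistic, the following is a minimal sketch of the standard squared allelic correlation between two biallelic sites. It assumes phased haplotypes encoded as strings of '0' and '1', which is an illustrative simplification; the frequency matching and data handling actually used are described in (13, 18) and are not reproduced. The function name and toy data are hypothetical.

```python
def r_squared(haplotypes, i, j):
    """Squared correlation (r^2) between alleles at sites i and j.

    haplotypes: list of equal-length strings of '0'/'1' (assumed phased).
    """
    n = len(haplotypes)
    p_a = sum(h[i] == "1" for h in haplotypes) / n   # allele frequency, site i
    p_b = sum(h[j] == "1" for h in haplotypes) / n   # allele frequency, site j
    p_ab = sum(h[i] == "1" and h[j] == "1" for h in haplotypes) / n
    d = p_ab - p_a * p_b                             # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic in the sample")
    return d * d / denom


# Made-up examples: complete association, then no association.
linked = ["11", "11", "00", "00"]
unlinked = ["11", "10", "01", "00"]
print(r_squared(linked, 0, 1))
print(r_squared(unlinked, 0, 1))
```

Plotting the average of this quantity against the physical distance between SNP pairs gives a decay curve of the kind summarized in Fig. 4.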


Fig. 4. The decay of LD for Indian and Chinese rhesus macaques versus European and African humans (n = 9 for all samples), along with the decay of LD for 1000 neutral simulations of our inferred demographic history for rhesus macaque. Human data are from three ENCODE regions orthologous to the rhesus data (13, 21).
 

In this study, we analyzed noncoding data in rhesus macaques to characterize their underlying demographic history and to quantify the extent of LD relative to humans. The genetic differences that we have observed between Indian and Chinese rhesus macaques are consistent with a recent report on the distribution of SNPs in these populations (10), as well as previous studies of protein-coding, microsatellite (short tandem repeat, STR), and MHC loci and of mitochondrial and Y-chromosome DNA haplotypes (8). Without samples from wild-caught Indian rhesus monkeys, however, these data must be regarded as estimates, because they may reflect a sampling bias toward those macaques that are available for study in the United States as a result of international restrictions on exportation of primates.

Extending these studies to whole-genome association mapping in captive-born animals could be fruitful for identifying genes involved in human diseases. On the basis of the patterns of LD that we have observed, such an association study would likely require many fewer markers to identify common disease-causing variants in rhesus macaques than in humans. Because LD in captive Indian rhesus macaque populations extends much further than in humans, a SNP map with roughly 1 SNP every 35 kb (82,000 SNPs total) would suffice to achieve the same threshold (r² = 0.4) as a marker every 6 kb in humans (13, 19). Furthermore, because LD decays much faster in Chinese rhesus monkeys than in humans, Chinese macaques provide an ideal platform for localizing mutations that are difficult to map in either Indian macaques or humans as a result of extensive LD among candidate mutations in a particular region.
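The marker-density comparison reduces to simple arithmetic, sketched below. The spacing values and the 82,000-SNP total come from the text; the genome length is not stated and is back-calculated here from those two numbers, and the human marker count assumes the same genome length purely for comparison. All names are hypothetical.

```python
def markers_needed(genome_kb, spacing_kb):
    """Approximate number of evenly spaced markers covering a genome."""
    return round(genome_kb / spacing_kb)


# Values from the text.
rhesus_spacing_kb = 35
rhesus_markers = 82_000
human_spacing_kb = 6

# Assumption: genome length implied by the two rhesus figures above.
implied_genome_kb = rhesus_spacing_kb * rhesus_markers

# Assumption: the same genome length is used for the human comparison.
human_markers_same_length = markers_needed(implied_genome_kb, human_spacing_kb)

print("implied genome length (Gb):", implied_genome_kb / 1e6)
print("markers at 35 kb spacing:", markers_needed(implied_genome_kb, rhesus_spacing_kb))
print("markers at 6 kb spacing:", human_markers_same_length)
print("fold difference:", round(human_spacing_kb and rhesus_spacing_kb / human_spacing_kb, 2))
```

Under these assumptions, a 6-kb map needs nearly six times as many markers as a 35-kb map of the same genome.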


References and Notes

  • 1. Rhesus Macaque Genome Sequencing and Analysis Consortium, Science 316, 222 (2007).
  • 2. R. A. Weiss, Nature 410, 1035 (2001).
  • 3. B. Ling et al., AIDS 16, 1489 (2002).
  • 4. J. Rogers et al., Genomics 87, 30 (2006).
  • 5. M. Raveendran et al., Genomics 88, 706 (2006).
  • 6. E. Delson, in The Macaques: Studies in Ecology, Behavior, and Evolution, D. D. Lindburg, Ed. (Van Nostrand Reinhold, New York, 1980), pp. 10–30.
  • 7. C. Abegg, B. Thierry, Biol. J. Linn. Soc. 75, 555 (2002).
  • 8. D. G. Smith, J. McDonough, Am. J. Primatol. 65, 1 (2005).
  • 9. J. Viray, B. Rolfs, D. G. Smith, Comp. Med. 51, 555 (2001).
  • 10. B. Ferguson et al., BMC Genom. 8, 43 (2007).
  • 11. ENCODE Project Consortium, Science 306, 636 (2004).
  • 12. ENCODE regions were chosen because they have been widely studied across several mammals, including rhesus and baboon.
  • 13. Materials and methods are available as supporting material on Science Online.
  • 14. D. Falush, M. Stephens, J. K. Pritchard, Genetics 164, 1567 (2003).
  • 15. A. L. Price et al., Nat. Genet. 38, 904 (2006).
  • 16. K. Hayasaka, K. Fujii, S. Horai, Mol. Biol. Evol. 13, 1044 (1996).
  • 17. J. C. Morales, D. J. Melnick, J. Hum. Evol. 34, 1 (1998).
  • 18. M. A. Eberle, M. J. Rieder, L. Kruglyak, D. A. Nickerson, PLoS Genet. 2, 1319 (2006).
  • 19. L. Kruglyak, Nat. Genet. 22, 139 (1999).
  • 20. J. Fooden, in The Macaques: Studies in Ecology, Behavior, and Evolution, D. D. Lindburg, Ed. (Van Nostrand Reinhold, New York, 1980), pp. 1–9.
  • 21. HapMap, Nature 437, 1299 (2005).
  • 22. We thank the Yerkes, Oregon, and California National Primate Research Centers for contributing samples, and D. G. Torgerson for comments. Funded by NIH grant RR05090 to D.G.S., NIH RR00163 to B.F., NIH RR015383 to J.R., NSF0516310 to C.D.B., and 1R01HG003229 to C.D.B., R.N., A. G. Clark, and T. Mattise. Trace Index numbers are consecutively numbered from 1664051535 to 1664070335 and can be retrieved using the following query: PROJECT_NAME='ENCODE' STRATEGY= 'Re-sequencing' TRACE_TYPE_CODE='PCR' SPECIES_CODE='MACACA MULATTA'.
Received for publication 26 January 2007. Accepted for publication 16 March 2007.







Science. ISSN 0036-8075 (print), 1095-9203 (online)