Reports
Demographic Histories and Patterns of Linkage Disequilibrium in Chinese and Indian Rhesus Macaques
To understand the demographic history of rhesus macaques (Macaca mulatta) and document the extent of linkage disequilibrium (LD) in the genome, we partially resequenced five Encyclopedia of DNA Elements regions in 9 Chinese and 38 captive-born Indian rhesus macaques. Population genetic analyses of the 1476 single-nucleotide polymorphisms discovered suggest that the two populations separated about 162,000 years ago, with the Chinese population tripling in size since then and the Indian population eventually shrinking by a factor of four. Using coalescent simulations, we confirmed that these inferred demographic events explain a much faster decay of LD in Chinese (r²
1 Biological Statistics and Computational Biology, Cornell University, Ithaca, NY 14850, USA. * To whom correspondence should be addressed. E-mail: cdb28{at}cornell.edu
Rhesus macaques (Macaca mulatta) and humans shared a most recent common ancestor (MRCA)
The current geographic range of rhesus macaques is larger than that of any other nonhuman primate, stretching from western India and Pakistan to the eastern shores of China (Fig. 1). Fossil records suggest that the genus Macaca originated in northern Africa approximately 5.5 Ma, followed by migration through the Middle East and into northern India by
Previous studies of mitochondrial DNA (8), major histocompatibility complex (MHC) alleles (9), and single-nucleotide polymorphisms (SNPs) in gene-linked regions (10) suggest moderate levels of genetic differentiation between captive-born Indian and Chinese rhesus populations. Developing a more thorough understanding of genetic variation within and between these two populations has important implications for biomedical research. For example, when infected with the simian immunodeficiency virus, animals from Chinese populations develop AIDS-like symptoms more slowly than animals from Indian populations (3). We have identified 1476 SNPs by sequencing >150 kb of DNA across five Encyclopedia of DNA Elements (ENCODE) (11-13) regions located on separate autosomal chromosomes in nine Chinese rhesus macaques captive-born to wild-caught parents and 38 captive-born Indian rhesus macaques. The Chinese animals derive from three distinct geographical sites, whereas the Indian animals came from three different colonies in the United States (Fig. 1). Individuals were chosen to represent rhesus macaque populations that are currently being studied by the international community and to minimize relatedness in the sample [with most individuals in the study being unrelated back to the founding of the colony into which they were born, and none having a shared grandparent (13)]. Of the 1476 SNPs discovered, only 486 (33%) were shared across both populations, whereas 604 were found only in the Chinese population (61% of 1090 SNPs observed) and 386 were found only in the Indian population (39% of 872 SNPs observed).
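The SNP counts in this paragraph can be cross-checked with simple arithmetic. The sketch below uses only the reported numbers; the variable names are illustrative. The 61% and 39% figures are reproduced when the population-specific SNPs are expressed as shares of the 990 SNPs that are not shared, which is our reading of the denominators rather than something the text states.

```python
# SNP counts as reported in the text.
shared = 486          # seen in both populations
chinese_only = 604    # seen only in the Chinese sample
indian_only = 386     # seen only in the Indian sample

# Totals implied by the three categories.
total = shared + chinese_only + indian_only    # all SNPs discovered
chinese_total = shared + chinese_only          # SNPs segregating in the Chinese sample
indian_total = shared + indian_only            # SNPs segregating in the Indian sample
private_total = chinese_only + indian_only     # SNPs seen in only one population

# Fractions, expressed as rounded percentages.
shared_pct = round(100 * shared / total)
chinese_private_pct = round(100 * chinese_only / private_total)  # assumed denominator
indian_private_pct = round(100 * indian_only / private_total)    # assumed denominator

print(total, chinese_total, indian_total)
print(shared_pct, chinese_private_pct, indian_private_pct)
```

The three totals match the 1476, 1090, and 872 SNPs quoted in the text.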
The frequency distribution of derived mutations across SNPs [using DNA sequence from the ENCODE project for baboon, Papio cynocephalus anubis, to infer the putative ancestral state (13)] shows that the Chinese population harbors an excess of rare SNPs relative to a population of constant size, whereas the Indian population has too few rare and too many intermediate- and high-frequency derived SNPs (Fig. 2A). The observed disparity in SNP density (7.25 SNPs per kb for Chinese versus 5.8 SNPs per kb for Indian) in the two populations suggests that the effective size of the Chinese population is much larger than that of the Indian population, given that the Indian sample size is four times as large as that of the Chinese.
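One way to see why the sample-size difference matters is to correct the SNP densities for the number of chromosomes sampled. The sketch below uses Watterson's estimator, which is our choice for illustration and is not named in the text. It assumes diploid animals (18 and 76 chromosomes) that were all sequenced at every site, and it assumes a constant-size neutral model, which the paragraph itself suggests does not hold exactly for either population.

```python
def harmonic_number(k: int) -> float:
    """Return 1 + 1/2 + ... + 1/k."""
    return sum(1.0 / i for i in range(1, k + 1))


def watterson_theta(snps_per_kb: float, n_chromosomes: int) -> float:
    """Watterson's estimator per kb: SNP density divided by a_n = H_(n-1)."""
    return snps_per_kb / harmonic_number(n_chromosomes - 1)


# Reported SNP densities; chromosome counts assume diploid animals.
chinese_theta = watterson_theta(7.25, 2 * 9)    # 9 Chinese animals
indian_theta = watterson_theta(5.8, 2 * 38)     # 38 Indian animals

print(f"Chinese theta_W per kb: {chinese_theta:.2f}")
print(f"Indian theta_W per kb:  {indian_theta:.2f}")
print(f"Ratio: {chinese_theta / indian_theta:.2f}")
```

Because theta scales with effective population size under a fixed mutation rate, a larger sample-size-corrected value for the Chinese sample points in the same direction as the argument in the text. The estimates in the paper come from a different, likelihood-based analysis.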
We observed a moderate level of population structure between the Indian and Chinese samples, as measured by Wright's FST statistic (average FST = 0.14; SD = 0.11; range = 0.024 to 0.645) (Fig. 3A). Furthermore, the Bayesian clustering program STRUCTURE (14) clearly separates Chinese and Indian individuals when assuming two clusters (Fig. 3B), and considering more clusters does not significantly improve the fit of the model. We found only one Chinese individual with a marginal amount of Indian ancestry (8.5%, sampled from Suzhou) and eight Indian individuals with more than 5% Chinese ancestry [max 16.8%, including animals from all three primate centers (13)]. These low levels of admixture suggest that recurrent migration between the populations has been minimal. Moreover, the two populations were clearly distinguished by principal components analysis (15) along the first two axes of variation (Fig. 3C). Interestingly, the second component also separates one Chinese individual (sampled from Suzhou) from the others, which suggests that further population substructure may exist. Although this individual is not differentiated from other Chinese-origin animals in the STRUCTURE analysis, it may, nonetheless, harbor alleles from an unsampled Chinese subpopulation (i.e., the two wild-caught parents may be from different subpopulations).
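As a rough illustration of what a per-SNP FST measures, the sketch below computes it for one biallelic SNP from the allele frequencies in two populations, using the textbook heterozygosity form FST = (HT - HS) / HT with equal weight on each population. The exact estimator used in the study is not given in this excerpt, and the allele frequencies below are hypothetical.

```python
def fst_two_populations(p1: float, p2: float) -> float:
    """FST for one biallelic SNP from allele frequencies in two populations.

    Uses (H_T - H_S) / H_T with equal weighting of the two populations.
    """
    p_bar = (p1 + p2) / 2                      # pooled allele frequency
    h_total = 2 * p_bar * (1 - p_bar)          # expected heterozygosity, pooled
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population
    if h_total == 0:
        return 0.0                             # SNP is monomorphic overall
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies, not taken from the study.
print(fst_two_populations(0.2, 0.6))   # moderately differentiated SNP
print(fst_two_populations(0.5, 0.5))   # identical frequencies give 0
print(fst_two_populations(0.0, 1.0))   # fixed difference gives 1
```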
Using maximum likelihood under the assumption that the animals in this study form a random sample from their respective populations (13), we fit a two-population demographic model to the joint distribution of SNP frequencies, or site-frequency spectrum, shown in Fig. 2B. Our model suggests that the Chinese population expanded by a factor of 3.3 and separated from the Indian population
The recent demographic events that caused these differences in effective population sizes of Indian and Chinese rhesus macaques have also had a large impact on linkage disequilibrium (LD). To quantify the extent of LD in Indian and Chinese rhesus macaques, we measured the correlation coefficient (r²) of alleles from frequency-matched SNPs (13, 18). Figure 4 shows substantial differences between the Indian and Chinese rhesus macaque populations, which are more extreme than the patterns observed among humans. For example, within the Indian rhesus population, LD extends much further than LD observed for European humans, whereas the Chinese rhesus population shows little LD, even for SNPs that are physically very close. Coalescent simulations (13) show that the observed patterns of LD are consistent with our inferred demographic history of this species (shown in Fig. 4 as light blue and pink curves for Indian and Chinese rhesus, respectively). However, LD in the Indian population extends slightly further than expected. This observation may be consistent with recent admixture with a Burmese rhesus population not sampled in this study (8), because admixture between populations with allele frequency differences is known to generate long-range LD.
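The LD statistic used here is the squared correlation between alleles at two SNPs. A minimal sketch of that calculation is shown below, assuming phased haplotypes with alleles coded 0 or 1; the function name and the toy data are hypothetical, and the frequency-matching step described in the text is not reproduced.

```python
def r_squared(haplotypes):
    """Squared allelic correlation between two biallelic SNPs.

    `haplotypes` is a list of (a, b) tuples, one per chromosome,
    with the allele at each SNP coded as 0 or 1.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n            # frequency of allele 1 at SNP A
    p_b = sum(b for _, b in haplotypes) / n            # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n  # 1-1 haplotype
    d = p_ab - p_a * p_b                               # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")
    return d * d / denom


# Toy data: alleles always travel together (complete LD).
linked = [(1, 1), (1, 1), (0, 0), (0, 0)]
# Toy data: every allele combination is equally common (no LD).
unlinked = [(1, 1), (1, 0), (0, 1), (0, 0)]

print(r_squared(linked))
print(r_squared(unlinked))
```

Plotting this quantity against the physical distance between SNP pairs gives the kind of LD decay curve that Fig. 4 describes.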
In this study, we analyzed noncoding data in rhesus macaques to characterize their underlying demographic history and to quantify the extent of LD relative to humans. The genetic differences that we have observed between Indian and Chinese rhesus macaques are consistent with a recent report on the distribution of SNPs in these populations (10), as well as previous studies of protein coding, microsatellite (short tandem repeat, STR), MHC loci, and mitochondrial and Y-chromosome DNA haplotypes (8). Without samples from wild-caught Indian rhesus monkeys, however, these data must be regarded as estimates, because they may reflect a sampling bias toward those macaques that are available for study in the United States as a result of international restrictions on exportation of primates. Extending these studies to whole-genome association mapping in captive-born animals could be fruitful for identifying genes involved in human diseases. On the basis of the patterns of LD that we have observed, such an association study would likely require many fewer markers to identify common disease-causing variants in rhesus macaques than in humans. Because LD in captive Indian rhesus macaque populations extends much further than in humans, a SNP map with roughly 1 SNP every 35 kb (82,000 SNPs total) would suffice to achieve the same threshold (r² = 0.4) as a marker every 6 kb in humans (13, 19). Furthermore, because LD decays much faster in Chinese rhesus monkeys than in humans, Chinese macaques provide an ideal platform for localizing mutations that are difficult to map in either Indian macaques or humans as a result of extensive LD among candidate mutations in a particular region.
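The marker-density comparison can be checked with a few lines of arithmetic. The sketch below takes the spacing and marker count from the text and derives the genome length they imply. Applying the 6 kb spacing to that same length is a simplifying assumption made only for comparison, not a figure from the study.

```python
# Figures quoted in the text.
rhesus_spacing_kb = 35      # one SNP every 35 kb in Indian rhesus macaques
rhesus_markers = 82_000     # total SNPs quoted for that spacing
human_spacing_kb = 6        # one marker every 6 kb in humans

# Genome length implied by the rhesus figures.
implied_genome_kb = rhesus_spacing_kb * rhesus_markers

# Assumption: apply the human spacing to the same length for comparison.
markers_at_human_spacing = implied_genome_kb / human_spacing_kb

# Ratio of marker counts needed at the two spacings.
fold_reduction = rhesus_spacing_kb / human_spacing_kb

print(f"Implied genome length: {implied_genome_kb / 1e6:.2f} Gb")
print(f"Markers at 6 kb spacing: {markers_at_human_spacing:,.0f}")
print(f"Fold reduction in markers: {fold_reduction:.1f}")
```

Under these assumptions, the 35 kb spacing needs roughly one-sixth as many markers as the 6 kb spacing.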
Science. ISSN 0036-8075 (print), 1095-9203 (online)