Reports
Genome of the Host-Cell Transforming Parasite Theileria annulata Compared with T. parva
Theileria annulata and T. parva are closely related protozoan parasites that cause lymphoproliferative diseases of cattle. We sequenced the genome of T. annulata and compared it with that of T. parva to understand the mechanisms underlying transformation and tropism. Despite high conservation of gene sequences and synteny, the analysis reveals unequally expanded gene families and species-specific genes. We also identify divergent families of putative secreted polypeptides that may reduce immune recognition, candidate regulators of host-cell transformation, and a Theileria-specific protein domain [frequently associated in Theileria (FAINT)] present in a large number of secreted proteins.
1 The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.
2 Division of Veterinary Infection and Immunity, Parasitology Group, Institute of Comparative Medicine, Faculty of Veterinary Medicine, Bearsden Road, Glasgow G61 1QH, UK.
3 The International Livestock Research Institute (ILRI), Post Office Box 30709, Nairobi, Kenya.
4 Plate-Forme Génomique-Pasteur Génopole Ile de France, Institut Pasteur, 25-28 rue du Docteur Roux, 75724 Paris, France.
5 Unité de Recherche Associée CNRS 2581, Département de Parasitologie, Bâtiment Elie Metchnikoff, Institut Pasteur, 25-28 rue du Docteur Roux, 75724 Paris Cedex 15, France.
6 European Molecular Biology Laboratory-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
7 The Institute for Genomic Research (TIGR), 9712 Medical Center Drive, Rockville, MD 20850, USA.
8 Division of Veterinary Clinical Studies, Royal School of Veterinary Studies, Easter Bush Veterinary Centre, Roslin, Midlothian EH25 9RG, UK.
9 Institute of Cell Biology, University of Bern, Baltzerstrasse 4, 3012 Bern, Switzerland.
10 Molecular Pathology, Institute of Animal Pathology, University of Bern, Laenggasstrasse 122, 3012 Bern, Switzerland.
11 Moredun Research Institute, Pentlands Science Park, Bush Loan, Penicuik, Midlothian EH26 0PZ, UK.
* To whom correspondence should be addressed. E-mail: ap2@sanger.ac.uk
Theileria are the only intracellular eukaryotic pathogens capable of reversibly transforming their host cells. Theileria annulata (TA) and T. parva (TP) are tick-borne hemoparasites (1) that give rise to lymphoproliferative diseases (2) of cattle known, respectively, as tropical theileriosis and East Coast fever (ECF). The molecular mechanisms of transformation are unknown, but previous analyses indicate that both species subvert the same host-cell signal transduction pathways (3). Although the parasites have similar life cycles involving intracellular stages in leukocytes and in red blood cells, they are transmitted by different tick species and transform different cell types. In contrast to ECF, cases of tropical theileriosis are accompanied by severe anemia. Available therapeutics are reliable only in the early stages of disease, and existing vaccines rely on the administration of live parasites. There is an urgent need for improved control and therapeutics.
The nuclear genome (4) of TA is similar in size (8.35 Mb) to that of TP (8.3 Mb); it spans four chromosomes that range from 1.9 to 2.6 Mb (Table 1 and table S1). We predicted 3792 putative protein-coding genes in TA. In addition, a total of 49 tRNA and 5 ribosomal RNA (rRNA) genes were found, revealing common features in rRNA units between the species (5) (table S1). The telomeres and presumptive centromeres of TA and TP are similar in base composition, size, and arrangement.
Like many parasitic protozoa, both Theileria spp. have tandem arrays of genus-specific, hypervariable gene families (6) (table S3) that map adjacent to the telomeres (6) with an overall arrangement that appears conserved (Fig. 1). Most of these subtelomeric genes encode predicted secreted proteins. Genes previously described as related to the restriction enzyme SfiI fragment (designated family 3, table S3) are found proximal to the telomeres (Fig. 1B), followed by Pro/Gln-rich proteins (family 1, table S3). The boundary between subtelomeric gene families and "housekeeping" genes is defined by adenosine 5'-triphosphate-binding cassette (ABC) transporter genes (family 5, table S3) in the opposite coding orientation. Stage-specific expressed sequence tags (ESTs) indicate that at least three subtelomeric ABC transporters are constitutively transcribed in macroschizont, merozoite, and piroplasm stages in the mammalian host. Members of gene families 3 and 5 also occur internally in the genome. Our findings are consistent with vigorous genetic exchange between subtelomeres, fostering expansion and diversification of antigens, with internal clusters that may act as reservoirs.
The nonsubtelomeric regions of the TA and TP genomes show strong conservation of synteny with only a few inversions of small sequence blocks and no interchromosomal rearrangements (Fig. 1A). Short interruptions to synteny correspond to the insertion or deletion of genes and usually involve members of large gene families, as exemplified by the TP repeat (Tpr) genes (4) and their Tpr-related counterparts in TA (Tar). These genes form the second largest family in both genomes. The majority of Tpr genes form a single array on TP chromosome 3 (5, 7), located at a large inversion point. Tar genes are dispersed throughout the four chromosomes in TA and cause small interruptions in synteny. The lower sequence divergence among Tpr genes compared with Tar genes suggests that they expanded after speciation. The single array in TP may allow gene conversion to prevent divergence.
Noncoding regions of subtelomeres are complex. In TA, from the terminus inward, a succession of paired guanine-cytosine (GC)-rich subtelomeric repeats (TaSrpt1 and TaSrpt2) is followed by a single-copy sequence at all chromosome ends (TaSR3; Fig. 1B and fig. S3). No such repeats are found in TP subtelomeres; a terminal sequence (TpSrpt1, [...]
We predicted 3265 orthologous genes between the genomes. Most genes without orthologs are members of gene families; only a small proportion (34 in TA, 60 in TP; table S4) are single-copy genes to which functions could not be ascribed, although EST data show that four of these are expressed in TA. No major species differences were found in the numbers of predicted transcription-associated proteins, peptidases (4), or core metabolic enzymes (5).
We evaluated evolutionary pressure acting on genes using the ratio of nonsynonymous to synonymous substitutions (dN/dS) between orthologs (table S7). This method can potentially identify immunogenic genes and thus putative vaccine candidates (8). Where possible, we matched dN/dS with stage-specific expression patterns from the EST data in TA. Constitutively expressed genes displayed the lowest dN/dS values (Fig. 2). Similar to Plasmodium (9), genes encoding merozoite surface proteins yielded the highest dN/dS ratios (Fig. 2); these proteins are candidates for immune selection (10). For predicted macroschizont polypeptides with signal peptides, dN/dS values were also high, although lower than those for merozoites. Surprisingly, genes encoding macroschizont glycosylphosphatidylinositol (GPI)-anchored membrane proteins have dN/dS values similar to those of housekeeping genes. In contrast, high dN/dS ratios were found for macroschizont proteins without predicted membrane retention motifs that are potentially secreted into the leukocyte cytosol. The high dN/dS values associated with host-exported Theileria proteins might reflect regulatory functions that have diversified after speciation of TA and TP. Alternatively, they might reflect exposure to the immune system, after rapid degradation to generate peptides presented by major histocompatibility complex antigens on the infected cell surface. Consistent with this, PEST (a signal for rapid proteolytic degradation) regions (11) were identified in many of these polypeptides (table S8).
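The dN/dS ratio used above can be illustrated with a minimal sketch. It assumes a simple counting approach with a Jukes-Cantor correction for multiple substitutions; this is only one common way to estimate the ratio and is not necessarily the method applied to table S7 (the actual procedure is described in the Supporting Online Material). The function names and the example counts are hypothetical.

```python
import math


def jukes_cantor(p):
    """Correct an observed proportion of differing sites for multiple hits.

    Assumption: the Jukes-Cantor model, which is only defined for p < 0.75.
    """
    if not 0 <= p < 0.75:
        raise ValueError("proportion of differences must be in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


def dn_ds(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Estimate dN/dS for one ortholog pair from pre-computed counts.

    nonsyn_diffs / syn_diffs: observed nonsynonymous / synonymous differences.
    nonsyn_sites / syn_sites: numbers of potential sites of each class.
    """
    dn = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    ds = jukes_cantor(syn_diffs / syn_sites)
    if ds == 0:
        raise ValueError("dS is zero; the ratio is undefined")
    return dn / ds


# Hypothetical counts for a conserved gene and a rapidly diverging gene.
conserved = dn_ds(nonsyn_diffs=6, nonsyn_sites=600, syn_diffs=40, syn_sites=200)
diverging = dn_ds(nonsyn_diffs=90, nonsyn_sites=600, syn_diffs=40, syn_sites=200)
print(f"conserved: {conserved:.3f}, diverging: {diverging:.3f}")
```

Under these assumptions, a gene under strong purifying selection gives a ratio well below 1, whereas a gene whose protein sequence is diversifying gives a higher ratio, which is the pattern the comparison between housekeeping and merozoite surface genes relies on.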
Almost all members of the major Theileria-specific subtelomeric protein families incorporate varying numbers (1 to 54) of a single, highly polymorphic domain with an average length of 70 residues, designated FAINT (frequently associated in Theileria) and formerly known as DUF529 (12). Over 900 copies were found in 166 TA proteins and in equivalent numbers of TP proteins (fig. S5). The majority of the FAINT domain-containing proteins have no other recognizable domains except a putative signal peptide, consistent with export to the host. However, in members of the TashAT gene cluster, one or more FAINT domains appear with AT-hook and PEST motifs on the same protein (13, 14) (fig. S5 and table S8). We found only one other FAINT domain-containing protein in the UniProt protein database (15), occurring in the nontransforming Theileria equi (synonym of Babesia equi), which also invades leukocytes and develops to a macroschizont stage (16). We also describe proteins containing previously unrecognized short amino acid repeat domains in both genomes (4). The species-specific nature of the domains suggests that they have expanded recently (4) (fig. S1).
The parasite genes involved in host-cell transformation must be expressed by the macroschizont stage, and their products must be released into the host cell cytoplasm or expressed on the parasite surface. This would generally require a signal peptide or a specific host-targeting signal sequence. Candidates meeting these criteria include the previously described TashAT and SuAT protein families in TA (13, 14) and related TP host nuclear proteins (TpHNs) in TP. In addition to localizing to the host nucleus, members of the TashAT family bear cyclin-dependent kinase phosphorylation motifs, cyclin docking sites, and AT-hook DNA binding domains (table S8). A cluster of 17 SuAT1- and TashAT-like genes was identified in the TA genome and an orthologous gene family of 20 members in a syntenic region of the TP genome. However, TpHNs lack a consensus AT-hook motif, a divergence that could be interpreted as a result of species adaptation to their preferred host-cell type.
We screened both predicted proteomes with a database of proteins linked to cell transformation and tumor progression (17) and matched the hits with the presence of a signal peptide and the macroschizont EST data set (4). No obvious proto-oncogenes, kinases, or phosphatases were identified. However, this screen did identify members of the HSP90 subfamily, DEAD-box RNA helicases, peptidases, immunophilins, members of the thioredoxin/glutaredoxin family, and leucine-zipper proteins (table S9). Proteins that function in lipid metabolism were also identified as transformation candidates. First, we found proteins related to phospholipase A2, whose activity is elevated in tumor cells (18), in both predicted proteomes and, unlike in other apicomplexan parasites, they carry a signal peptide. Second, choline kinase genes (ChoKs) are present at high copy number compared with other apicomplexans. ChoK activity is deregulated in transformed cell lines and its inhibition results in a reversible blockage of cell proliferation (19). Finally, the cell cycle effectors uridine phosphorylases and leucine carboxyl methyltransferases (20), whose activity is raised in tumor cells (21), are tandemly duplicated in TA and TP. However, no signal sequence is predicted for the latter three enzymes, so it remains to be determined whether their expansion reflects the ability of the macroschizont to maintain host-cell transformation.
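The three-way screen described above (a hit against the transformation-linked protein database, a predicted signal peptide, and evidence of macroschizont expression) amounts to intersecting three annotations per protein. The following is a minimal sketch of that filter only; the record layout, field names, and gene identifiers are hypothetical, and the real annotations would come from the similarity search, signal-peptide prediction, and EST data described in the Supporting Online Material.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProteinAnnotation:
    """Hypothetical per-protein summary of the three screening criteria."""
    gene_id: str
    hits_transformation_db: bool   # similarity hit to a transformation-linked protein
    has_signal_peptide: bool       # predicted N-terminal signal peptide
    in_macroschizont_ests: bool    # transcript seen in macroschizont ESTs


def transformation_candidates(proteins):
    """Return gene IDs that satisfy all three criteria, in input order."""
    return [
        p.gene_id
        for p in proteins
        if p.hits_transformation_db
        and p.has_signal_peptide
        and p.in_macroschizont_ests
    ]


# Made-up identifiers, for illustration only.
example = [
    ProteinAnnotation("gene_A", True, True, True),
    ProteinAnnotation("gene_B", True, False, True),   # no signal peptide
    ProteinAnnotation("gene_C", False, True, True),   # no database hit
    ProteinAnnotation("gene_D", True, True, False),   # not seen in macroschizont ESTs
]
print(transformation_candidates(example))
```

Note that a strict intersection of this kind would exclude enzymes such as those for which no signal sequence is predicted, which is why the text treats them as open questions rather than as candidates from the screen.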
Supporting Online Material
www.sciencemag.org/cgi/content/full/309/5731/131/DC1
Materials and Methods
Figs. S1 to S5
Tables S1 to S9
References
Received for publication 31 January 2005. Accepted for publication 5 May 2005.
Science. ISSN 0036-8075 (print), 1095-9203 (online)