Science 20 July 2001: Vol. 293. no. 5529, pp. 498 - 506 DOI: 10.1126/science.1061217
Complete Genome Sequence of a Virulent Isolate of Streptococcus pneumoniae
Hervé Tettelin, Karen E. Nelson, Ian T. Paulsen, Jonathan A. Eisen, Timothy D. Read, Scott Peterson, John
Heidelberg, Robert T. DeBoy, Daniel H. Haft, Robert J. Dodson, A. Scott Durkin, Michelle Gwinn, James F.
Kolonay, William C. Nelson, Jeremy D. Peterson, Lowell A. Umayam, Owen White, Steven L. Salzberg,
Matthew R. Lewis, Diana Radune, Erik Holtzapple, Hoda Khouri, Alex M. Wolf, Terry R. Utterback, Cheryl L.
Hansen, Lisa A. McDonald, Tamara V. Feldblyum, Samuel Angiuoli, Tanja Dickinson, Erin K. Hickey, Ingeborg
E. Holt, Brendan J. Loftus, Fan Yang, Hamilton O. Smith, J. Craig Venter, Brian A. Dougherty, Donald A.
Morrison, Susan K. Hollingshead, and Claire M. Fraser
Supplementary Material
Supplemental Figure 1. Linear representation of the S. pneumoniae TIGR4 genome. The location of predicted coding regions color-coded by biological role (see Fig. 1) is displayed, as well as rRNA and tRNA genes. Arrowed boxes represent the direction of transcription for each ORF. Thin arrows represent IS elements. Numbers next to the tRNA symbols represent the number of tRNAs at a locus. Numbers next to GES regions represent the number of membrane-spanning domains predicted by TopPred (displayed only for ORF products with five or more predicted membrane-spanning regions). Transcriptional terminators are represented by hairpins.

Supplemental Figure 2. Comparison of the S. pneumoniae ORFs to those of other completely sequenced genomes. All ORFs were searched with FASTA3 against all ORFs from other complete genomes including those of plasmids, organelles and phages. The number of S. pneumoniae ORFs whose highest similarity (P < 10⁻⁵) is to an ORF from a given species is shown. Abbreviations: LACLA, Lactococcus lactis; BACHA, Bacillus halodurans; BACSU, Bacillus subtilis; STAAU, Staphylococcus aureus; ECOLI, Escherichia coli; ECO157H7, Escherichia coli O157H7; PASMU, Pasteurella multocida; HAEIN, Haemophilus influenzae; THEMA, Thermotoga maritima; SYNSP, Synechocystis sp.; PORGI, Porphyromonas gingivalis; NEIMEa, Neisseria meningitidis serogroup A; VIBCH, Vibrio cholerae; METJA, Methanococcus jannaschii; PSEAE, Pseudomonas aeruginosa; PYRFU, Pyrococcus furiosus; CAUCR, Caulobacter crescentus; DEIRA, Deinococcus radiodurans; MYCTU, Mycobacterium tuberculosis; CAMJE, Campylobacter jejuni; NEIMEb, Neisseria meningitidis serogroup B; ARCFU, Archaeoglobus fulgidus; BBUR, Borrelia burgdorferi; GEOSU, Geobacter sulfurreducens; HELP99, Helicobacter pylori J99; HELPY, Helicobacter pylori 26695; PYRHO, Pyrococcus horikoshii; UREUR, Ureaplasma urealyticum; AQUAE, Aquifex aeolicus; CELEG, Caenorhabditis elegans; METTH, Methanobacterium thermoautotrophicum; PLASMID_lacla.pMRC01, Lactococcus lactis plasmid; PYRAB, Pyrococcus abyssi; XYLFA, Xylella fastidiosa; YEAST, Saccharomyces cerevisiae; ARATH, Arabidopsis thaliana; CHLTE, Chlamydia trachomatis; PHAGE_streptococcus_thermophilus_720, Streptococcus thermophilus phage; THEAC, Thermoplasma acidophilum; TREPA, Treponema pallidum; DROME, Drosophila melanogaster; HALSP, Halobacterium sp.; HUMAN, Homo sapiens; MYCGE, Mycoplasma genitalium.
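The per-species tally described in this caption can be sketched as follows. This is a minimal illustration, not the authors' pipeline. It assumes the FASTA3 output has already been reduced to (query ORF, species code, P value) tuples and that "highest similarity" means the lowest P value. The function name `count_best_hits`, the input layout, and the example values are all hypothetical.

```python
from collections import Counter

P_CUTOFF = 1e-5  # significance threshold stated in the caption


def count_best_hits(hits, cutoff=P_CUTOFF):
    """Count, per species, the query ORFs whose most significant hit is to that species.

    hits: iterable of (query_orf, species_code, p_value) tuples (hypothetical layout).
    """
    best = {}  # query_orf -> (p_value, species_code)
    for orf, species, p in hits:
        if p >= cutoff:
            continue  # the caption requires P < 10^-5
        if orf not in best or p < best[orf][0]:
            best[orf] = (p, species)
    return Counter(species for _, species in best.values())


# Invented values, for illustration only
example_hits = [
    ("orf_1", "LACLA", 1e-40),
    ("orf_1", "BACSU", 1e-30),
    ("orf_2", "BACSU", 1e-12),
    ("orf_3", "ECOLI", 1e-3),  # not significant, ignored
]
best_hit_counts = count_best_hits(example_hits)
```

Each ORF contributes to at most one species, so the counts sum to the number of ORFs with at least one significant match.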

Supplemental Figure 3. Organization of the polymorphic type I restriction enzyme (hsdS) operon (SP0505-SP0510). SP0505 and SP0507 are partial hsdS (specificity subunit) genes. SP0506 is an integrase gene; SP0508-SP0510 encode the specificity, modification and restriction subunits. 'A' (thick bar) is an inverted 85 bp repeat, 'B' (thin bar) is an inverted 15 bp repeat. 'A' and 'B' share the core sequence ATTATGGGAA. Sequenced clones included fusions of the hsdS and the hsdS' pseudogenes, with the boundary at either the 'A' or the 'B' repeat.

Supplemental Figure 4. Structure of the 21 PTS transporters. Schematic representation of the PTS Enzyme II gene clusters in S. pneumoniae. Each gene is shown as an arrow indicating the direction of transcription, and the gene number is provided below the line. The regions of each gene encoding PTS domains are color-coded (IIA, blue; IIB, red; IIC, green; and IID, magenta). Additionally, non-PTS genes are indicated in black (transcriptional regulator) or orange (sugar hydrolase). Genes flanking the PTS genes that may be co-transcribed are not shown. For each Enzyme II cluster the probable substrate specificity is provided to the right; a question mark indicates that the assignment is speculative. Where a substrate specificity assignment was not possible, the PTS family of the Enzyme II constituents is indicated.

| Supplemental Table 1. S. pneumoniae lineage-specific duplications (a). Predicted proteins were grouped into clusters if at least one member of a cluster is more similar to another member of the cluster than to any other gene. |
| Cluster | First gene in cluster |
| SP0137 SP1342 | ABC transporter, ATP-binding protein |
| SP0687 SP1957 SP1987 | ABC transporter, ATP-binding protein |
| SP1341 SP0912 | ABC transporter, ATP-binding protein |
| SP0148 SP0620 | ABC transporter, substrate-binding protein |
| SP1826 SP0243 | ABC transporter, substrate-binding protein |
| SP1002 SP2169 | adhesion lipoprotein |
| SP0112 SP1394 | amino acid ABC transporter, periplasmic amino acid-binding protein, putative |
| SP0533 SP0041 SP0541 | bacteriocin BlpK |
| SP0524 SP1281 | BlpT protein, fusion |
| SP0529 SP0043 | BlpC ABC transporter |
| SP0527 SP2236 | sensor histidine kinase BlpH, putative |
| SP0536 SP0543 | immunity protein BlpL |
| SP0526 SP2235 | response regulator BlpR |
| SP0377 SP0391 SP0390 SP0378 | choline binding protein C |
| SP0440 SP1770 SP1765 SP1766 SP1767 SP1771 SP1764 SP1365 | glycosyl transferase, degenerate |
| SP0658 SP0999 | cytochrome c-type biogenesis protein CcdA |
| SP0603 SP2193 | DNA-binding response regulator VncR |
| SP1089 SP2072 | glutamine amidotransferase, class I |
| SP0301 SP0578 | glycosyl hydrolase, family 1, truncation |
| SP0834 SP1923 | hemolysin-related protein |
| SP0712 SP0715 | lactate oxidase, truncation |
| SP1273 SP1274 SP1367 | licD1 protein |
| SP1235 SP0740 | MutT/nudix family protein |
| SP1330 SP1685 | N-acetylmannosamine-6-P epimerase, putative |
| SP1326 SP1687 | neuraminidase, putative |
| SP1325 SP1686 | oxidoreductase, Gfo/Idh/MocA family |
| SP1471 SP1472 | oxidoreductase, putative |
| SP1427 SP1429 | peptidase, U32 family |
| SP1359 SP0660 | peptide methionine sulfoxide reductase |
| SP1700 SP1701 | phospho-2-dehydro-3-deoxyheptonate aldolase |
| SP1331 SP1674 | phosphosugar-binding transcriptional regulator, putative |
| SP0117 SP2190 SP0930 SP2136 SP2201 SP0069 | pneumococcal surface protein A |
| SP0667 SP1573 SP0965 | pneumococcal surface protein, putative |
| SP1061 SP1612 | protein kinase, putative |
| SP0645 SP1198 SP1199 | PTS system IIA component, putative |
| SP0248 SP0308 | PTS system, IIA component |
| SP0064 SP0284 | PTS system, IIA component |
| SP0646 SP1197 | PTS system, IIB component, putative |
| SP0062 SP2162 | PTS system, IIC component |
| SP0860 SP2060 | pyrrolidone-carboxylate peptidase |
| SP0680 SP0280 | ribosomal small subunit pseudouridine synthase A |
| SP1324 SP1675 | ROK family protein |
| SP0604 SP2192 | sensor histidine kinase VncS |
| SP0466 SP0467 | sortase, putative |
| SP0659 SP1000 | thioredoxin family protein |
| SP0141 SP1115 | transcriptional regulator |
| SP2006 SP0014 | transcriptional regulator ComX2 |
| SP0163 SP1057 SP1946 | transcriptional regulator PlcR, putative |
| SP1989 SP2090 | transcriptional regulator PlcR, putative |
| SP0584 SP1144 | transcriptional regulator, putative |
| SP0676 SP0927 | transcriptional regulator, putative |
| SP1858 SP2234 | transcriptional regulator, TetR family |
| SP1615 SP2030 | transketolase, authentic frameshift |
| SP0599 SP0601 | transmembrane protein Vexp1 |
| SP0886 SP0509 | type I restriction-modification system, M subunit, putative |
| SP0892 SP0510 | type I restriction-modification system, R subunit, putative |
| SP0505 SP0508 | type I restriction-modification system, S subunit, putative |
| SP0887 SP0891 | type I restriction-modification system, S subunit, putative |
| SP0664 SP1154 SP0071 | zinc metalloprotease ZmpB, putative |
| SP0143 SP0144 SP0925 SP0545 | conserved domain protein |
| SP1136 SP1575 | conserved domain protein |
| SP1332 SP1346 | conserved domain protein |
| SP0145 SP0379 | conserved hypothetical protein |
| SP0619 SP1637 | conserved hypothetical protein |
| SP0686 SP1988 SP1956 | conserved hypothetical protein |
| SP1003 SP1174 SP1175 SP1004 | conserved hypothetical protein |
| SP1327 SP1691 SP1680 | conserved hypothetical protein |
| SP1334 SP1348 | conserved hypothetical protein |
| SP0108 SP0114 | hypothetical protein |
| SP0153 SP1436 | hypothetical protein |
| SP0164 SP1058 | hypothetical protein |
| SP0455 SP0917 | hypothetical protein |
| SP0733 SP1487 SP0810 SP1302 | hypothetical protein |
| SP0734 SP0809 SP1303 | hypothetical protein |
| SP0833 SP2159 | hypothetical protein |
| SP1109 SP0906 | hypothetical protein |
| SP1333 SP1347 | hypothetical protein |
| SP1335 SP1349 | hypothetical protein |
| SP1480 SP0031 SP1481 | hypothetical protein |
| SP1488 SP0087 | hypothetical protein |
| SP1531 SP1805 | hypothetical protein |
| SP1707 SP1708 | hypothetical protein |
| SP1262 SP1444 SP1639 SP1792 SP0028 SP1692 | IS1167, transposase |
| SP0361 SP0836 SP1582 SP0460 | IS1167, transposase, truncation |
| SP0328 SP1418 SP0343 SP0495 SP0714 SP1337 SP1352 SP1439 SP1503 SP1595 SP2089 SP2179 | IS1380-Spn1, transposase |
| SP0537 SP1927 SP0900 SP2137 SP2080 | IS1381, transposase OrfA |
| SP0942 SP1310 | IS1381, transposase OrfA |
| SP1729 SP1928 SP0039 SP2138 | IS1381, transposase OrfA/OrfB, truncation |
| SP0538 SP1195 SP0941 SP1086 SP2079 | IS1381, transposase OrfB |
| SP2154 SP1596 | IS3-Spn1, hypothetical protein, truncation |
| SP0299 SP0345 SP0818 SP0995 | IS630-Spn1, transposase Orf1, authentic frameshift |
| SP0132 SP1149 SP0015 SP2015 SP0086 | IS630-Spn1, transposase Orf1, degenerate |
| SP0300 SP1148 SP0344 SP0819 SP0996 | IS630-Spn1, transposase Orf2 |
| SP1314 SP1443 | IS66 family element, Orf1 |
| SP0363 SP0811 SP2212 SP0643 | transposase family protein, truncation |
| SP1064 SP1622 | transposase, IS200 family |
| SP1496 SP2014 SP0016 | transposase, IS630-Spn1 related, Orf2 |
| (a) The extent of potential lineage-specific gene duplications in this genome was estimated by identification of ORFs that are more similar to other ORFs within the TIGR4 genome than to ORFs from other complete genomes including those of plasmids, organelles, and phages. All ORFs were searched with FASTA3 against all ORFs from the complete genomes and matches with a FASTA p value of 10⁻⁵ were considered significant. |
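The criterion in footnote (a) can be sketched roughly as below. This is an illustration under stated assumptions, not the procedure the authors ran. It assumes each ORF's best within-genome hit and best hit to any other genome are already known, treats a lower P value as "more similar", and merges linked ORFs with a small union-find. The function name `duplication_clusters`, both input layouts, and the ORF labels are hypothetical.

```python
P_CUTOFF = 1e-5  # significance threshold from footnote (a)


def duplication_clusters(self_hits, other_best):
    """Group ORFs whose best within-genome match beats their best match elsewhere.

    self_hits:  {orf: (paralog, p_value)}, best hit to another ORF in the same genome.
    other_best: {orf: p_value}, best hit to any ORF from another genome (absent if none).
    Both input layouts are hypothetical.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for orf, (paralog, p_self) in self_hits.items():
        p_other = other_best.get(orf, 1.0)
        if p_self < P_CUTOFF and p_self < p_other:
            parent[find(orf)] = find(paralog)  # link the pair into one cluster

    clusters = {}
    for orf in parent:
        clusters.setdefault(find(orf), set()).add(orf)
    return sorted(sorted(members) for members in clusters.values())


# Invented values, for illustration only
example_self = {
    "orf_A": ("orf_B", 1e-50),
    "orf_B": ("orf_A", 1e-50),
    "orf_C": ("orf_A", 1e-8),   # better match in another genome, not linked
    "orf_D": ("orf_A", 1e-20),
}
example_other = {"orf_A": 1e-20, "orf_B": 1e-20, "orf_C": 1e-30, "orf_D": 1e-6}
example_clusters = duplication_clusters(example_self, example_other)
```

Linking through connected components means a cluster can contain members that are not each other's best hit, which matches the table caption's requirement that only one member need be closest to another member.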
| Supplemental Table 2. ORFs that do not have a 10⁻⁵ E value match in other low-GC Gram-positive species (a). |
| ORF | Description |
| SP0015 | IS630-Spn1, transposase Orf1 |
| SP0016 | IS630-Spn1, transposase Orf2 |
| SP0024 | conserved hypothetical protein |
| SP0060 | beta-galactosidase |
| SP0065 | sugar isomerase domain protein AgaS |
| SP0075 | phosphorylase, Pnp/Udp family |
| SP0086 | IS630-Spn1, transposase Orf1, truncation |
| SP0159 | conserved hypothetical protein |
| SP0276 | conserved hypothetical protein |
| SP0298 | conserved hypothetical protein |
| SP0299 | IS630-Spn1, transposase Orf1, authentic frameshift |
| SP0300 | IS630-Spn1, transposase Orf2 |
| SP0304 | conserved hypothetical protein |
| SP0318 | carbohydrate kinase, PfkB family |
| SP0344 | IS630-Spn1, transposase Orf2 |
| SP0345 | IS630-Spn1, transposase Orf1, authentic frameshift |
| SP0390 | choline binding protein G |
| SP0409 | conserved hypothetical protein |
| SP0481 | conserved hypothetical protein |
| SP0574 | hypothetical protein |
| SP0584 | transcriptional regulator, putative |
| SP0606 | oxidoreductase, putative |
| SP0628 | HIT family protein |
| SP0637 | membrane protein |
| SP0638 | conserved hypothetical protein |
| SP0641 | serine protease, subtilase family |
| SP0695 | HesA/MoeB/ThiF family protein |
| SP0751 | branched-chain amino acid ABC transporter, permease protein |
| SP0795 | PEP-utilizing enzymes family protein |
| SP0818 | IS630-Spn1, transposase Orf1, authentic frameshift |
| SP0819 | IS630-Spn1, transposase Orf2 |
| SP0858 | membrane protein, putative |
| SP0859 | membrane protein |
| SP0887 | type I restriction-modification system, S subunit, putative |
| SP0892 | type I restriction-modification system, R subunit, putative |
| SP0907 | capsular polysaccharide biosynthesis protein, putative |
| SP0930 | choline binding protein E |
| SP0939 | conserved hypothetical protein |
| SP0962 | lactoylglutathione lyase |
| SP0965 | endo-beta-N-acetylglucosaminidase |
| SP0977 | tellurite resistance protein TehB |
| SP0995 | IS630-Spn1, transposase Orf1, authentic frameshift |
| SP1063 | ABC-2 transporter, permease protein, putative |
| SP1068 | phosphoenolpyruvate carboxylase |
| SP1069 | conserved hypothetical protein |
| SP1077 | conserved domain protein |
| SP1143 | conserved hypothetical protein |
| SP1144 | conserved hypothetical protein |
| SP1148 | IS630-Spn1, transposase Orf2 |
| SP1149 | IS630-Spn1, transposase Orf1 |
| SP1222 | type II restriction endonuclease, putative |
| SP1240 | conserved hypothetical protein |
| SP1251 | endonuclease, putative |
| SP1261 | conserved hypothetical protein |
| SP1264 | conserved domain protein |
| SP1268 | licB protein |
| SP1269 | choline kinase |
| SP1270 | alcohol dehydrogenase, zinc-containing |
| SP1315 | v-type sodium ATP synthase, subunit D |
| SP1319 | v-type sodium ATP synthase, subunit C |
| SP1321 | v-type sodium ATP synthase, subunit K |
| SP1322 | v-type sodium ATP synthase, subunit I |
| SP1326 | neuraminidase, putative |
| SP1327 | conserved hypothetical protein |
| SP1343 | prolyl oligopeptidase family protein |
| SP1344 | conserved hypothetical protein |
| SP1350 | conserved domain protein |
| SP1428 | conserved hypothetical protein |
| SP1431 | type II DNA modification methyltransferase, putative |
| SP1442 | IS66 family element, Orf2 |
| SP1492 | cell wall surface anchor family protein |
| SP1496 | transposase, IS630-Spn1 related, Orf2 |
| SP1543 | conserved hypothetical protein, authentic point mutation |
| SP1546 | conserved domain protein |
| SP1547 | conserved hypothetical protein |
| SP1549 | polypeptide deformylase |
| SP1550 | glutathione S-transferase family protein |
| SP1600 | hypothetical protein |
| SP1680 | conserved hypothetical protein |
| SP1687 | neuraminidase B |
| SP1691 | conserved hypothetical protein |
| SP1740 | conserved hypothetical protein |
| SP1765 | glycosyl transferase, family 8 |
| SP1768 | conserved hypothetical protein |
| SP1770 | glycosyl transferase, family 8 |
| SP1783 | MutT/nudix family protein |
| SP1809 | transcriptional regulator |
| SP1826 | ABC transporter, substrate-binding protein |
| SP1827 | hypothetical protein |
| SP1850 | type II restriction endonuclease DpnI |
| SP1851 | conserved hypothetical protein |
| SP1894 | sucrose phosphorylase |
| SP1916 | PAP2 family protein |
| SP2014 | IS630-Spn1, transposase Orf2 |
| SP2015 | IS630-Spn1, transposase Orf1 |
| SP2017 | membrane protein |
| SP2027 | conserved hypothetical protein |
| SP2031 | conserved hypothetical protein |
| SP2037 | PTS system, IIB component |
| SP2063 | LysM domain protein, authentic frameshift |
| SP2081 | conserved hypothetical protein |
| SP2122 | conserved hypothetical protein |
| SP2146 | conserved hypothetical protein |
| SP2158 | L-fucose isomerase |
| SP2165 | fucose operon FucU protein |
| (a) All ORFs were searched with FASTA3 against all ORFs from other complete genomes including those of plasmids, organelles and phages. Matches with a FASTA p value of 10⁻⁵ were considered significant. |
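The selection behind Supplemental Table 2 amounts to a filter over the same search results. The sketch below assumes hits are available as (query ORF, species code, P value) tuples and that "significant" means a P value below the cutoff. The table does not list which species were treated as low-GC Gram-positive, so the set used in the example is a placeholder, and the name `orfs_without_match` is hypothetical.

```python
P_CUTOFF = 1e-5  # significance threshold from footnote (a)


def orfs_without_match(all_orfs, hits, species_of_interest, cutoff=P_CUTOFF):
    """Return the ORFs that have no significant hit in any of the given species.

    all_orfs: iterable of ORF identifiers, in the order they should be reported.
    hits: iterable of (query_orf, species_code, p_value) tuples (hypothetical layout).
    species_of_interest: set of species codes to test against.
    """
    matched = {
        orf
        for orf, species, p in hits
        if species in species_of_interest and p < cutoff
    }
    return [orf for orf in all_orfs if orf not in matched]


# Placeholder species set and invented values, for illustration only
low_gc_placeholder = {"BACSU", "BACHA"}
example_orfs = ["orf_1", "orf_2", "orf_3"]
example_hits = [
    ("orf_1", "BACSU", 1e-30),  # significant match in the set
    ("orf_2", "ECOLI", 1e-40),  # significant, but outside the set
    ("orf_3", "BACHA", 1e-2),   # in the set, but not significant
]
unmatched = orfs_without_match(example_orfs, example_hits, low_gc_placeholder)
```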
| Supplemental Table 3. Genes involved in competence. |
| Operon | ORF | Gene name | Induced (a) | Required (b) | Function (c) | Closest relatives (d, e) | Other relatives (e, f) |
| 1 | SP0954 | celA | + | + | DNA binding | BS BH e-20 | TM e-7 |
| 1 | SP0955 | celB | + | + | DNA uptake pore | BS BH e-18 | MG e-7 |
| 2 | SP0978 | coiA | + | + | Unknown | BH BS e-12 | - |
| 3 | SP1266 | dal, cilB | + | + | DNA processing | BH e-31 | many e-25 |
| 4 | SP1808 | cclA, cilC | + | + | Prepilin processing peptidase | AQ e-10 | TM SS NM DR BS e-6 |
| 5 | SP1908 | ssbB, cilA | + | + | ssDNA binding | BH BS e-18 | many e-10 |
| 6 | SP1937 | lytA | + | - | Autolysin | SP e-18 | family of 15: SPN e-10 |
| 6 | SP1939 | dinF | + | - | Efflux pump | BH BS e-28 | many e-20 |
| 6 | SP1940 | recA | + | + | Strand assimilation | BH BS e-50 | many e-50 |
| 6 | SP1941 | cinA | + | - | Unknown | BH BS e-50 | SS e-50; many e-17 |
| 7 | SP2051 | cglC | + | + | Pilin-like wall structure | BH e-6 | - |
| 7 | SP2052 | cglB | + | + | Prepilin transport ATPase | BH BS e-10 | - |
| 7 | SP2053 | cglA | + | + | Prepilin transport pore | BS BH e-25 | many e-18 |
| 8 | SP2208 | cflA | + | + | Helicase? | BS BH e-50 | SS CJ SP TP AA e-6 |
| 8 | SP2207 | cflB | + | + | Uptake pilot protein? | BS BH e-21 | NM e-6 |
(a) Expression induced strongly in competent cells.
(b) Required for transformation (mutation decreases recombinants < 70%).
(c) Lacks, S. A., 1999. DNA uptake by transformable bacteria. In Transport of Molecules across Microbial Membranes, Broome-Smith, J. K., Baumberg, S., Stirling, C. J., and Ward, F. B. (eds.), pp.138-168, Cambridge University Press, Cambridge, UK.
(d) Among completed genomes, species with closest ortholog are given, with BLAST probability.
(e) Species abbreviations: BS B. subtilis, BH B. halodurans, SS Synechocystis spp., CJ Campylobacter jejuni, TP T. pallidum, AA Aquifex aeolicus, NM N. meningitidis, DR Deinococcus radiodurans, MG Mycoplasma genitalium, TM Thermotoga maritima, - none.
(f) Species with more distant homologs. |
| Supplemental Table 4. Genes related to virulence based on experimental data (a). |
| ORF (b) | Description | Gene name | Other known roles (b) | Supporting data (c) | Reference (d) |
| Adherence |
| SP0377 | choline binding protein C | pcpC/cbpC (smaA) | | A,D | Gosink et al. 2000, Lau et al. 2001 |
| SP0660 | peptide methionine sulfoxide reductase | msrA | | A | Wizemann 1996 |
| SP0730 | pyruvate oxidase | spxB | Host defense, cellular metabolism | A,H | Overweg et al. 2000, Spellerberg 1996 |
| SP0933 | pyrroline-5-carboxylate reductase | proC, smmJ | | A,C | Tuomanen et al. 2000, Lau et al. 2001 |
| SP0966 | adherence and virulence protein | pavA | | A | Lau et al. 2001 |
| SP1274 | LicD2 protein | licD2 | | A,B,C | Zhang et al. 1999 |
| SP1650 | manganese ABC transporter, manganese-binding adhesion lipoprotein | psaA | Acquisition | A,B,C,F | Sampson et al. 1994, Dintilhac et al. 1997, Berry and Paton, 1996 |
| SP2190 | choline binding protein A | cbpA | Host defense | A,B,C | Rosenow et al. 1997 |
| SP2194 | ATP dependent Clp protease, ATP binding subunit | clpC | | A | Charpentier et al. 2000 |
| Cellular metabolism/ acquisition of nutrients |
| SP0044 | phosphoribosylaminoimidazole-succinocarboxamide synthase | purC, SPN-1646 | | B | Polissi et al. 1998 |
| SP0045 | phosphoribosylformylglycinamidine synthase, putative | purL, SPN-786 | | B | Polissi et al. 1998 |
| SP0047 | phosphoribosylformylglycinamide cyclo-ligase | purM, smmB | | C,D | Lau et al. 2001 |
| SP0048 | phosphoribosylglycinamide formyltransferase | purN, smmB | | C,D | Lau et al. 2001 |
| SP0053 | phosphoribosylaminoimidazole carboxylase, catalytic subunit | purE, SPN-1404 | | B | Polissi et al. 1998 |
| SP0054 | phosphoribosylaminoimidazole carboxylase, ATPase subunit | purK | | B,D | Polissi et al. 1998 |
| SP0251 | formate acetyltransferase, putative | smmF | | C,D | Lau et al. 2001 |
| SP0265 | glycosyl hydrolase, family 1 | SPN-1818 | | B | Polissi et al. 1998 |
| SP0267 | oxidoreductase, putative | SPN-962 | | B | Polissi et al. 1998 |
| SP0268 | alkaline amylopullulanase, putative | spuA | | G | Zysk et al. 2000, Bongaerts et al. 2000 |
| SP0314 | hyaluronidase | hyl | | D | Berry et al. 1994 |
| SP0498 | endo-beta-N-acetylglucosaminidase, putative | | | G | Zysk et al. 2000 |
| SP0648 | beta-galactosidase | bgaA | | G | Zysk et al. 2000 |
| SP0659 | thioredoxin family protein | smsB | | C,D | Lau et al. 2001 |
| SP0766 | superoxide dismutase, manganese-dependent | sodA | | D | Yesilkaya et al. 2000 |
| SP0932 | gamma-glutamyl phosphate reductase | smmK | | C | Lau et al. 2001 |
| SP0948 | PhoH family protein | SPN-1585 | | B | Polissi et al. 1998 |
| SP0965 | endo-beta-N-acetylglucosaminidase | lytB | Colonization, Host defense | B,C,E | Garcia et al. 1999, Wizemann et al. 2001 |
| SP0981 | protease maturation protein, putative | ppmA | Host defense | D,G,H | Overweg et al. 2000b |
| SP1024 | serine hydroxymethyltransferase | smmI | | C | Lau et al. 2001 |
| SP1068 | phosphoenolpyruvate carboxylase | smmE | | C,D | Lau et al. 2001 |
| SP1168 | mutator MutX protein | mutX | | B,D | Mejean et al. 1994, Polissi et al. 1998 |
| SP1326 | neuraminidase, putative | nanB | | C | Berry et al. 1996 |
| SP1445 | GMP synthase | guaA | | B | Polissi et al. 1998 |
| SP1469 | NADH oxidase | nox | | C | Auzat et al. 1999 |
| SP1498 | phosphoglucomutase | pgm | | C | Hardy et al. 2000 |
| SP1573 | lysozyme | lytC | Colonization, adherence, host defense | A,B,E | Lopez et al. 2000, Wizemann et al. 2001 |
| SP1574 | triosephosphate isomerase | tpi | | G | Zysk et al. 2000 |
| SP1687 | neuraminidase B | nanB | | E | Berry et al. 1996 |
| SP1693 | neuraminidase A, authentic frameshift | nanA | | C,D | Camara et al. 1994 |
| SP1782 | ribosomal protein L11 methyltransferase | SPN-1583 | | B | Polissi et al. 1998 |
| SP1923 | pneumolysin | ply | Host defense | C,D,E,G | Walker et al. 1987, Berry et al. 2000 |
| SP1937 | autolysin | lytA | | C,D,E | Garcia et al. 1986, Berry et al. 2000 |
| SP2066 | threonine synthase | smmG | | D | Lau et al. 2001 |
| SP2091 | glycerol-3-phosphate dehydrogenase (NAD(P)+) | smmH, gpsA | Adherence | A,D | Lau et al. 2001, Tuomanen, 2000 |
| SP2099 | penicillin-binding protein 1B | pbp1B, smcA | | C | Lau et al. 2001 |
| SP2142 | ROK family protein | SPN-1808 | | B | Polissi et al. 1998 |
| Transporters/ proteases |
| SP0043 | competence factor transport protein | comB | | C | Lau et al. 2001 |
| SP0185 | magnesium transporter, CorA family | smtJ | | C,D | Lau et al. 2001 |
| SP0284 | PTS system, mannose-specific IIAB components | smtK | | D | Lau et al. 2001 |
| SP0366 | oligopeptide ABC transporter, oligopeptide-binding protein | aliA/plpA | Adherence | A | Tuomanen et al. 2000, Pearce 1994, Alloing 1994, Cundell et al. 1995 |
| SP0483 | ABC transporter, ATP-binding protein | smtG | | C,D | Lau et al. 2001 |
| SP0609 | amino acid ABC transporter, amino-acid binding protein | glnQ, SPN-1364 | | B | Polissi et al. 1998 |
| SP0610 | amino acid ABC transporter, ATP binding protein | glnH, SPN-1452 | | B | Polissi et al. 1998 |
| SP0641 | serine protease, subtilase family | prtA | | G,E | Zysk et al. 2000, Wizemann et al. 2001 |
| SP0664 | zinc metalloprotease ZmpB, putative | SPN-1338 | | B,C | Polissi et al. 1998 |
| SP0737 | sodium-dependent transporter | | | G | Zysk et al. 2000 |
| SP0820 | ATP-dependent Clp protease, ATP-binding subunit | clpE, SPN-1055 | | B | Polissi et al. 1998 |
| SP0823 | amino acid ABC transporter, permease protein | smtC | | C,D | Lau et al. 2001 |
| SP0913 | ABC transporter, permease protein, putative | smtF | | C,D | Lau et al. 2001 |
| SP1032 | iron-compound ABC transporter, iron compound-binding protein | smtA, pitlA | | C,D,G | Zysk et al. 2000, Brown et al. 2001 |
| SP1033 | iron-compound ABC transporter, permease protein | smtA, pitlB | | C,D | Lau et al. 2001, Brown et al. 2001 |
| SP1241 | amino acid ABC transporter, periplasmic solute-binding protein/permease protein | | | G | Zysk et al. 2000 |
| SP1342 | toxin secretion ABC transporter, ATP-binding/permease protein | SPN-948 | | B | Polissi et al. 1998 |
| SP1386 | spermidine/putrescine ABC transporter, periplasmic spermidine/putrescine-binding protein | potD, SPN-924 | | B | Polissi et al. 1998 |
| SP1389 | spermidine/putrescine ABC transporter, ATP-binding protein | potA, SPN-2041 | | B,D | Polissi et al. 1998 |
| SP1527 | oligopeptide ABC transporter, oligopeptide-binding protein | aliB | | C,D | Lau et al. 2001, Alloing et al. 1994 |
| SP1580 | sugar ABC transporter, ATP-binding protein | msmK, SPN-1802 | | B | Polissi et al. 1998 |
| SP1623 | cation-transporting ATPase, E1-E2 family | SPN-1145 | | B | Polissi et al. 1998 |
| SP1715 | ABC transporter, ATP-binding protein | smtB,H | | C,D | Lau et al. 2001 |
| SP1889 | oligopeptide ABC transporter, permease protein | amiD | Adherence | A | Cundell et al. 1995 |
| SP1891 | oligopeptide ABC transporter, oligopeptide-binding protein | amiA | Host defense, adherence | A,H | Cundell et al. 1995, Overweg et al. 2000 |
| SP1957 | ABC transporter, ATP-binding protein | SPN-1113 | | B | Polissi et al. 1998 |
| SP2019 | ABC transporter, ATP-binding protein, truncation | | | C | Bartilson et al. 2001 |
| SP2169 | zinc ABC transporter, zinc-binding lipoprotein | adcA | | B,C | Dintilhac et al. 1997 |
| SP2220 | ABC transporter, ATP-binding protein | smtE | | C,D | Lau et al. 2001 |
| SP2230 | ABC transporter, ATP-binding protein | | | G | Zysk et al. 2000 |
| Host defense |
| SP0069 | choline binding protein I | cbpI | | | Gosink et al. 2000 |
| SP0071 | immunoglobulin A1 protease | iga, SPN-1471 | Colonization | B,D,F | Poulsen et al. 1998, Polissi et al. 1998 |
| SP0117 | pneumococcal surface protein A | pspA | Cellular metabolism | C,E,F,G,H | Yother et al. 1994, Hollingshead et al. 2000, Hammerschmidt et al. 1999, Berry et al. 2000 |
| SP0346 | capsular polysaccharide biosynthesis protein | cps4A | Colonization | B,C | Caimano et al. 2000, Garcia et al. 2000, Paton et al. 2000 |
| SP0347 | capsular polysaccharide biosynthesis protein | cps4B | Colonization | B,C | Caimano et al. 2000, Garcia et al. 2000, Paton et al. 2000 |
| SP0348 | capsular polysaccharide biosynthesis protein | cps4C | Colonization | B,C | Caimano et al. 2000, Garcia et al. 2000, Paton et al. 2000 |
| SP0349 | capsular polysaccharide biosynthesis protein | cps4D | Colonization | B,C | Caimano et al. 2000, Garcia et al. 2000, Paton et al. 2000 |
| SP0350 | capsular polysaccharide biosynthesis protein | cps4E | Colonization | B,C | Caimano et al. 2000, Garcia et al. 2000, Paton et al. 2000 |
| SP0351 | capsular polysaccharide biosynthesis protein | cps4F | Colonization | B,C | Caimano et al. 2000, Garcia et al. 2000, Paton et al. 2000 |
| SP0352 | capsular polysaccharide biosynthesis protein | cps4G | Colonization | B,C | Caimano et al. 2000, Garcia et al. 2000, Paton et al. 2000 |
| SP0353 | capsular polysaccharide biosynthesis protein | cps4H | Colonization | B,C | Caimano et al. 2000, Garcia et al. 2000, Paton et al. 2000 |
| SP0357 | UDP-N-acetylglucosamine-2-epimerase | cps4I | Colonization | B,C | Caimano et al. 2000, Garcia et al. 2000, Paton et al. 2000 |
| SP0358 | capsular polysaccharide biosynthesis protein | cps4J | Colonization | B,C | Caimano et al. 2000, Garcia et al. 2000, Paton et al. 2000 |
| SP0359 | capsular polysaccharide biosynthesis protein | cps4K | Colonization | B,C | Caimano et al. 2000, Garcia et al. 2000, Paton et al. 2000 |
| SP0360 | UDP-N-acetylglucosamine-2-epimerase | cps4L | Colonization | B,C | Caimano et al. 2000, Garcia et al. 2000, Paton et al. 2000 |
| SP0390 | choline binding protein G | cbpG | Colonization, adherence | A,B,C | Gosink et al. 2000 |
| SP0930 | choline binding protein E | cbpE, pce | Colonization, adherence, cellular metabolism | A,B | Gosink et al. 2000 |
| SP1003 | conserved hypothetical protein | phtD | | E,G | Adamou et al. 2001 |
| SP1004 | conserved hypothetical protein | phtC | | E,G | Adamou et al. 2001 |
| SP1154 | immunoglobulin A1 protease | iga | Colonization | B,D,F | Poulsen et al. 1998 |
| SP1174 | conserved domain protein | phtB | | E,G | Adamou et al. 2001 |
| SP1175 | conserved domain protein | phtA | | E,G | Adamou et al. 2001 |
| SP1693 | neuraminidase A, authentic frameshift | nanA | | B | Tong et al. 2000 |
| SP2190 | choline binding protein A | cbpA | Adhesion | A,B,C | Rosenow et al. 1997 |
| SP2201 | choline binding protein D | cbpD | Colonization | B | Gosink et al. 2000 |
| Other categories/ unknown |
| SP0023 | DNA repair protein RadA, authentic point mutation | SPN-636 | | B | Polissi et al. 1998 |
| SP0143 | conserved domain protein | SPN-627 | | B | Polissi et al. 1998 |
| SP0175 | riboflavin synthase, beta subunit | ribH | | G | Zysk et al. 2000 |
| SP0370 | recombination protein U | SPN-633 | | B | Polissi et al. 1998 |
| SP0371 | conserved hypothetical protein | SPN-631 | | B | Polissi et al. 1998 |
| SP0629 | conserved hypothetical protein | smcB | | C,D | Lau et al. 2001 |
| SP0742 | conserved hypothetical protein | SPN-641 | | B | Polissi et al. 1998 |
| SP0761 | ATP-dependent RNA helicase, DEAD/DEAH box family | | | G | Zysk et al. 2000 |
| SP0771 | peptidyl-prolyl cis-trans isomerase, cyclophilin-type | | | G | Zysk et al. 2000 |
| SP1026 | conserved hypothetical protein | | | E | Wizemann et al. 2001 |
| SP1207 | exodeoxyribonuclease VII, large subunit | xseA | | G | Zysk et al. 2000 |
| SP1291 | Cof family protein | SPN-224 | | B | Polissi et al. 1998 |
| SP1482 | oxidoreductase, Gfo/Idh/MocA family | smuA | | C,D | Lau et al. 2001 |
| SP1637 | conserved hypothetical protein | SPN-1119 | | B | Polissi et al. 1998 |
| SP1654 | conserved hypothetical protein | smuB | | C,D | Lau et al. 2001 |
| SP1972 | membrane protein | SPN-233 | | B | Polissi et al. 1998 |
| SP1997 | Cof family protein | SPN-1101 | | B | Polissi et al. 1998 |
| SP2053 | competence protein | cglA | | C | Bartilson et al. 2001 |
| SP2116 | conserved domain protein | SPN-655 | | B | Polissi et al. 1998 |
| SP2144 | conserved hypothetical protein | SPN-1631 | | B | Polissi et al. 1998 |
| SP2145 | antigen, cell wall surface anchor family | smuD | | C,D | Lau et al. 2001 |
| SP2146 | conserved hypothetical protein | SPN-1200 | | B | Polissi et al. 1998 |
| SP2236 | putative sensor histidine kinase | comD | | C,D | Bartilson et al. 2001 |
|
(a) The table is limited to genes for which experimental data support a role in virulence. This list connects data in the scientific literature prior to May 2001 with specific gene loci in the TIGR4 isolate. A broad definition of virulence as anything required for the infectious process but not for life in vitro has been used. This non-exhaustive list demonstrates the range of factors that are required to maintain the complex interaction between the bacterium and its host. We thank Dr. Gregor Zysk and Dr. Andrea Polissi for sharing specific sequence data not deposited in GenBank (those for which nucleotide sequence identity is greater than 92% are displayed).
(b) The genes have been divided into five general categories: Adherence/Colonization, Cellular Metabolism/Acquisition of Nutrients, Transporters/Proteases, Host Defense, and Other Categories/Unknown.
(c) Eight specific types of experimental data justified inclusion in the table: A- in vitro adherence assay, B- gene knockout ineffective in mouse, rat or chinchilla nasopharyngeal colonization model, C- gene knockout ineffective in mouse septicemia model, D- gene knockout ineffective in mouse S. pneumoniae respiratory tract model, E- antibodies are protective against invasive disease in animal model, F- antibodies are protective against colonization in animal model, G- antibodies are elicited during infection in humans, H- antibodies show opsonic activity with human polymorphonuclear leukocytes.
(d) References: P. Garcia, J. L. Garcia, E. Garcia, R. Lopez, Gene 43, 265 (1986); J. A. Walker, R. L. Allen, P. Falmagne, M. K. Johnson, G. J. Boulnois, Infect Immun 55, 1184 (1987); G. Alloing, P. de Philip, J. P. Claverys, J Mol Biol 241, 44 (1994); A. M. Berry et al., Infect Immun 62, 1101 (1994); M. Camara, G. J. Boulnois, P. W. Andrew, T. J. Mitchell, Infect Immun 62, 3688 (1994); V. Mejean, C. Salles, L. C. Bullions, M. J. Bessman, J. P. Claverys, Mol Microbiol 11, 323 (1994); B. J. Pearce, A. M. Naughton, H. R. Masure, Mol Microbiol 12, 881 (1994); J. S. Sampson, S. P. O'Connor, A. R. Stinson, J. A. Tharpe, H. Russell, Infect Immun 62, 319 (1994); J. Yother, J. M. White, J Bacteriol 176, 2976 (1994); D. R. Cundell, B. J. Pearce, J. Sandros, A. M. Naughton, H. R. Masure, Infect Immun 63, 2493 (1995); A. M. Berry, J. C. Paton, Infect Immun 64, 5255 (1996); A. M. Berry, R. A. Lock, J. C. Paton, J Bacteriol 178, 4854 (1996); B. Spellerberg et al., Mol Microbiol 19, 803 (1996); T. M. Wizemann et al., Proc Natl Acad Sci U S A 93, 7985 (1996); A. Dintilhac, G. Alloing, C. Granadel, J. P. Claverys, Mol Microbiol 25, 727 (1997); C. Rosenow et al., Mol Microbiol 25, 819 (1997); A. Polissi et al., Infect Immun 66, 5620 (1998); K. Poulsen et al., Infect Immun 66, 181 (1998); I. Auzat et al., Mol Microbiol 34, 1018 (1999); P. Garcia, M. P. Gonzalez, E. Garcia, R. Lopez, J. L. Garcia, Mol Microbiol 31, 1275 (1999); S. Hammerschmidt, G. Bethe, P. H. Remane, G. S. Chhatwal, Infect Immun 67, 1683 (1999); J. R. Zhang, I. Idanpaan-Heikkila, W. Fischer, E. I. Tuomanen, Mol Microbiol 31, 1477 (1999); R. J. Bongaerts, H. P. Heinz, U. Hadding, G. Zysk, Infect Immun 68, 7141 (2000); A. M. Berry, J. C. Paton, Infect Immun 68, 133 (2000); M. J. Caimano, G. G. Hardy, J. Yother, in Streptococcus pneumoniae - Molecular biology and mechanisms of disease A. Tomasz, Ed. (Mary Ann Liebert, Larchmont, NY, 2000) pp. 115; E. Charpentier, R. Novak, E. Tuomanen, Mol Microbiol 37, 717 (2000); E. Garcia, C. Arrecubieta, R. Munoz, M. Mollerach, R. Lopez, in Streptococcus pneumoniae - Molecular biology and mechanisms of disease A. Tomasz, Ed. (Mary Ann Liebert, Larchmont, NY, 2000) pp. 139; K. K. Gosink, E. R. Mann, C. Guglielmo, E. I. Tuomanen, H. R. Masure, Infect Immun 68, 5690 (2000); G. G. Hardy, M. J. Caimano, J. Yother, J Bacteriol 182, 1854 (2000); S. K. Hollingshead, R. Becker, D. E. Briles, Infect Immun 68, 5889 (2000); R. Lopez, M. P. Gonzalez, E. Garcia, J. L. Garcia, P. Garcia, Res Microbiol 151, 437 (2000); K. Overweg et al., Infect Immun 68, 4604 (2000); K. Overweg et al., Infect Immun 68, 4180 (2000b); J. C. Paton, J. K. Morona, R. Morona, in Streptococcus pneumoniae - Molecular biology and mechanisms of disease A. Tomasz, Ed. (Mary Ann Liebert, Larchmont, NY, 2000) pp. 129; H. H. Tong, L. E. Blue, M. A. James, T. F. DeMaria, Infect Immun 68, 921 (2000); E. I. Tuomanen, H. R. Masure, in Streptococcus pneumoniae - Molecular biology and mechanisms of disease A. Tomasz, Ed. (Mary Ann Liebert, Larchmont, NY, 2000) pp. 295; H. Yesilkaya et al., Infect Immun 68, 2819 (2000); G. Zysk et al., Infect Immun 68, 3740 (2000); J. E. Adamou et al., Infect Immun 69, 949 (2001); M. Bartilson et al., Mol Microbiol 39, 126 (2001); J. S. Brown, S. M. Gilliland, D. W. Holden, Mol Microbiol 40, 572 (2001); G. W. Lau et al., Mol Microbiol 40, 555 (2001); T. M. Wizemann et al., Infect Immun 69, 1593 (2001). |
| Supplemental Table 5. Complete list of genes containing stretches of iterative DNA that could induce phase variation (a). |
| ORF | Repeat | Region | Description |
| SP0001 | CCCCCCC | middle | chromosomal replication initiator protein DnaA |
| SP0001 | AAAAAAAA | 3prime | chromosomal replication initiator protein DnaA |
| SP0006 | TGATGATGATGAT | middle | transcription-repair coupling factor |
| SP0014 | GGGGGG | middle | transcriptional regulator ComX1 |
| SP0035 | GGGGGG | promot | aromatic amino acid aminotransferase |
| SP0037 | GGGGGG | 5prime | fatty acid/phospholipid synthesis protein PlsX |
| SP0071 | ATATATAT | middle | immunoglobulin A1 protease |
| SP0071 | TATATATA | 3prime | immunoglobulin A1 protease |
| SP0080 | AGAGAGAG | 3prime | hypothetical protein |
| SP0097 | CACACACA | 3prime | conserved domain protein |
| SP0097 | TGTGTGTG | middle | conserved domain protein |
| SP0100 | TATATATA | promot | conserved hypothetical protein |
| SP0102 | GGGGGG | middle | glycosyl transferase |
| SP0106 | GAGAGAGA | middle | L-serine dehydratase, iron-sulfur-dependent, beta subunit |
| SP0111 | AGAGAGAG | middle | amino acid ABC transporter, ATP-binding protein, putative |
| SP0129 | CCCCCC | 3prime | glycoprotease family protein |
| SP0130 | CCCCCC | 3prime | IS1167, transposase, degenerate |
| SP0130 | CTCTCTCT | middle | IS1167, transposase, degenerate |
| SP0131 | CCCCCC | 3prime | IS630-Spn1, transposase Orf2, degenerate |
| SP0133 | AAAAAAAA | 3prime | hypothetical protein |
| SP0134 | AAAAAAAA | promot | hypothetical protein |
| SP0137 | TTTTTTTT | middle | ABC transporter, ATP-binding protein |
| SP0137 | AAAAAAAA | 3prime | ABC transporter, ATP-binding protein |
| SP0138 | TATATATAT | 3prime | hypothetical protein |
| SP0144 | TTTTTTTT | middle | hypothetical protein |
| SP0145 | TCTCTCTC | 3prime | conserved hypothetical protein |
| SP0153 | AGGAAGGAAGGAA | middle | hypothetical protein |
| SP0163 | ATATATATAT | 5prime | transcriptional regulator PlcR, putative |
| SP0167 | ATATATAT | 3prime | hypothetical protein |
| SP0168 | TTATTATTATTA | 5prime | macrolide efflux protein, putative |
| SP0178 | ATATATAT | 3prime | riboflavin biosynthesis protein RibD |
| SP0188 | AAAAAAAA | middle | hypothetical protein |
| SP0205 | CTCTCTCT | 5prime | anaerobic ribonucleoside-triphosphate reductase activating protein |
| SP0210 | GTGGTGGTGGTG | middle | ribosomal protein L4 |
| SP0227 | TTGTTGTTGTTG | 5prime | ribosomal protein S5 |
| SP0254 | GAGAGAGA | promot | leucyl-tRNA synthetase |
| SP0259 | TTTTTTTT | promot | Holliday junction DNA helicase RuvB |
| SP0274 | CTCTCTCTC | middle | DNA polymerase III, alpha subunit, Gram-positive type |
| SP0274 | GGGGGG | 3prime | DNA polymerase III, alpha subunit, Gram-positive type |
| SP0278 | CTCTCTCT | 5prime | aminopeptidase PepS |
| SP0288 | CCCCCC | promot | conserved hypothetical protein |
| SP0294 | TATATATA | promot | ribosomal protein L13 |
| SP0312 | GAGAGAGA | 5prime | glycosyl hydrolase, family 31 |
| SP0319 | CCCCCC | middle | conserved domain protein |
| SP0338 | CTCTCTCTC | 5prime | ATP-dependent Clp protease, ATP-binding subunit, putative |
| SP0346 | TATTTATTTATT | 5prime | capsular polysaccharide biosynthesis protein Cps4A |
| SP0349 | AAAAAAAA | 5prime | capsular polysaccharide biosynthesis protein Cps4D |
| SP0350 | AGAGAGAGA | middle | capsular polysaccharide biosynthesis protein Cps4E |
| SP0351 | AAAAAAAA | 5prime | capsular polysaccharide biosynthesis protein Cps4F |
| SP0351 | AAAAAAAAA | 5prime | capsular polysaccharide biosynthesis protein Cps4F |
| SP0352 | ATATATAT | 5prime | capsular polysaccharide biosynthesis protein Cps4G |
| SP0352 | TTTTTTTT | middle | capsular polysaccharide biosynthesis protein Cps4G |
| SP0353 | AAAAAAAA | 5prime | capsular polysaccharide biosynthesis protein Cps4H |
| SP0356 | GAGAGAGA | 5prime | O-antigen transporter RfbX, putative |
| SP0362 | GGGGGG | middle | IS66 family element, Orf3, degenerate |
| SP0362 | TCTCTCTC | 5prime | IS66 family element, Orf3, degenerate |
| SP0372 | GAGAGAGA | promot | conserved hypothetical protein |
| SP0380 | TGTGTGTGT | 5prime | hypothetical protein |
| SP0380 | GGGGGG | 5prime | hypothetical protein |
| SP0386 | TTTTTTTT | 5prime | sensor histidine kinase, putative |
| SP0387 | TCTCTCTC | 3prime | DNA-binding response regulator |
| SP0394 | GGGGGG | middle | PTS system, mannitol-specific IIBC components |
| SP0401 | TTTTTTTT | middle | helicase, putative |
| SP0411 | AGAGAGAG | 3prime | seryl-tRNA synthetase |
| SP0415 | GGGGGG | middle | enoyl-CoA hydratase/isomerase family protein |
| SP0419 | GGGGGG | middle | enoyl-(acyl-carrier-protein) reductase |
| SP0422 | CTCTCTCT | middle | 3-oxoacyl-(acyl-carrier-protein) synthase II |
| SP0437 | AGCAGCAGCAGC | 3prime | glutamyl-tRNA(Gln) amidotransferase, A subunit |
| SP0453 | AGAGAGAG | middle | amino acid ABC transporter, amino acid-binding protein/permease protein |
| SP0458 | TCTTTCTTTCTT | 3prime | DNA-damage inducible protein P |
| SP0460 | AGAGAGAG | 3prime | IS1167, transposase |
| SP0460 | GGTAGAGGTAGAGGTAGAGGTAGAG | 5prime | IS1167, transposase |
| SP0462 | GAGAGAGAG | middle | cell wall surface anchor family protein |
| SP0481 | CTCTCTCT | 3prime | conserved hypothetical protein |
| SP0484 | GAGAGAGA | middle | conserved hypothetical protein |
| SP0494 | GCTGCTGCTGCT | 3prime | CTP synthase |
| SP0496 | AAAAAAAA | 5prime | Na/Pi cotransporter II-related protein |
| SP0505 | CCCCCC | 5prime | type I restriction-modification system, S subunit, putative |
| SP0508 | GGGGGG | middle | type I restriction-modification system, S subunit |
| SP0514 | TCTTTCTTTCTT | 3prime | hypothetical protein |
| SP0523 | ACTGGACTGGACTGG | middle | ABC transporter, permease protein, putative |
| SP0532 | GGGGGG | 5prime | bacteriocin BlpJ |
| SP0544 | AAAAAAAA | 3prime | immunity protein BlpX |
| SP0547 | CTCTCTCT | 3prime | conserved domain protein |
| SP0559 | GAGAGAGA | 3prime | hypothetical protein |
| SP0560 | GAGAGAGA | 3prime | hypothetical protein |
| SP0565 | GAGAGAGA | 5prime | conserved domain protein |
| SP0567 | ACACACACA | 3prime | conserved domain protein |
| SP0568 | ACACACACA | promot | valyl-tRNA synthetase |
| SP0570 | TTTTTTTT | 3prime | conserved domain protein |
| SP0570 | AAAAAAAA | 3prime | conserved domain protein |
| SP0574 | GGGGGG | 3prime | hypothetical protein |
| SP0575 | GAGAGAGA | middle | helicase, putative |
| SP0577 | ATGATGATGATG | 5prime | PTS system, beta-glucosides-specific IIABC components |
| SP0580 | GAGAGAGA | middle | acetyltransferase, GNAT family |
| SP0582 | AAAAAAAA | 5prime | hypothetical protein |
| SP0590 | GAGAGAGA | 5prime | acetyltransferase, GNAT family |
| SP0593 | TCTCTCTCT | 3prime | leucine-rich protein |
| SP0595 | TATATATA | middle | hypothetical protein |
| SP0604 | CTCTCTCT | 5prime | sensor histidine kinase VncS |
| SP0604 | GAGAGAGA | middle | sensor histidine kinase VncS |
| SP0609 | TTTTTTTT | promot | amino acid ABC transporter, amino acid-binding protein |
| SP0614 | GGGGGG | 5prime | tributyrin esterase |
| SP0615 | ATATATAT | 3prime | beta-lactam resistance factor |
| SP0618 | TGTGTGTG | middle | excinuclease ABC, subunit C |
| SP0618 | GGGGGG | 3prime | excinuclease ABC, subunit C |
| SP0621 | ATATATAT | promot | hypothetical protein |
| SP0636 | CTCTCTCT | 3prime | ABC transporter, ATP-binding protein |
| SP0641 | TATATATA | 5prime | serine protease, subtilase family |
| SP0641 | ATGATGATGATG | middle | serine protease, subtilase family |
| SP0641 | AAAAAAAA | 3prime | serine protease, subtilase family |
| SP0644 | GAGAGAGA | 5prime | IS66 family element, Orf3, degenerate |
| SP0652 | GAGAGAGA | 3prime | conserved hypothetical protein |
| SP0663 | GAGAGAGA | 3prime | conserved hypothetical protein |
| SP0664 | CAAAACAAAACAAAA | 5prime | zinc metalloprotease ZmpB, putative |
| SP0683 | AGAGAGAG | promot | hypothetical protein |
| SP0689 | GGGGGG | 5prime | UDP-N-acetylglucosamine--N-acetylmuramyl-(pentapeptide) pyrophosphoryl-undecaprenol N-acetylglucosamine transferase |
| SP0689 | GGGGGG | 5prime | UDP-N-acetylglucosamine--N-acetylmuramyl-(pentapeptide) pyrophosphoryl-undecaprenol N-acetylglucosamine transferase |
| SP0695 | ATATATATAT | middle | HesA/MoeB/ThiF family protein |
| SP0697 | TTTTTTTT | promot | ABC transporter, ATP-binding protein, authentic point mutation |
| SP0704 | TTTTTTTT | 5prime | hypothetical protein |
| SP0704 | TTATTATTATTA | middle | hypothetical protein |
| SP0705 | ATATATAT | 5prime | hypothetical protein |
| SP0705 | TTTTTTTT | middle | hypothetical protein |
| SP0715 | GGGGGG | middle | lactate oxidase |
| SP0719 | CTCTCTCT | middle | conserved hypothetical protein |
| SP0729 | GCTTGCTTGCTT | 3prime | cation-transporting ATPase, E1-E2 family |
| SP0753 | GGGGGGG | middle | branched-chain amino acid ABC transporter, ATP-binding protein |
| SP0758 | GGGGGG | 5prime | PTS system, IIABC components |
| SP0762 | GGGGGG | promot | S-adenosylmethionine synthetase |
| SP0785 | GGGGGG | 5prime | conserved hypothetical protein |
| SP0797 | CTCTCTCT | 5prime | aminopeptidase N |
| SP0798 | TTTTTTTT | promot | DNA-binding response regulator CiaR |
| SP0802 | CACACACA | 5prime | DNA polymerase III, epsilon subunit/ATP-dependent helicase DinG |
| SP0807 | AAAAAAAA | 5prime | septation ring formation regulator EzrA, putative |
| SP0818 | AAAAAAAAAA | 3prime | IS630-Spn1, transposase Orf1, authentic frameshift |
| SP0829 | GAGAGAGA | 3prime | phosphopentomutase |
| SP0837 | CTCTCTCT | 3prime | DNA topology modulation protein FlaR, putative |
| SP0842 | AAAAAAAA | 5prime | pyrimidine-nucleoside phosphorylase |
| SP0846 | GGGGGG | 5prime | sugar ABC transporter, ATP-binding protein |
| SP0851 | AGAGAGAG | 3prime | conserved hypothetical protein |
| SP0852 | GCGCGCGC | 3prime | topoisomerase IV, subunit B |
| SP0872 | AAAAAAAA | promot | D-alanyl-D-alanine carboxypeptidase |
| SP0887 | AATAATAATAAT | 5prime | type I restriction-modification system, S subunit, putative |
| SP0887 | AAAAAAAA | 3prime | type I restriction-modification system, S subunit, putative |
| SP0887 | CCCCCC | 3prime | type I restriction-modification system, S subunit, putative |
| SP0890 | GAGAGAGAG | 5prime | integrase/recombinase, phage integrase family |
| SP0891 | AAAAAAAA | 3prime | type I restriction-modification system, S subunit, putative |
| SP0891 | CCCCCC | 3prime | type I restriction-modification system, S subunit, putative |
| SP0892 | TCTCTCTC | 3prime | type I restriction-modification system, R subunit, putative |
| SP0894 | AAAAAAAAA | promot | X-pro dipeptidyl-peptidase |
| SP0895 | TGATATTGATATTGATAT | middle | DNA polymerase III, alpha subunit |
| SP0897 | ATATATATA | promot | pyruvate kinase |
| SP0901 | AAAAAAAA | 3prime | hypothetical protein |
| SP0907 | GGGGGG | 5prime | capsular polysaccharide biosynthesis protein, putative |
| SP0911 | GTGTGTGTGT | middle | hypothetical protein |
| SP0931 | CTCTCTCTC | 3prime | glutamate 5-kinase |
| SP0966 | AAAAAAAA | 5prime | adherence and virulence protein A |
| SP0981 | AAAAAAAA | 5prime | protease maturation protein, putative |
| SP0987 | AAAAAAAA | promot | hypothetical protein |
| SP0994 | AGAGAGAG | middle | hypothetical protein |
| SP0999 | TCTCTCTC | 5prime | cytochrome c-type biogenesis protein CcdA |
| SP1001 | TGTGTGTG | 5prime | amino acid permease family protein |
| SP1005 | TATATATA | middle | conserved domain protein, degenerate |
| SP1010 | TTTTTTTT | 3prime | large conductance mechanosensitive channel protein MscL |
| SP1033 | TATATATA | 5prime | iron-compound ABC transporter, permease protein |
| SP1045 | GAGAGAGA | 5prime | conserved hypothetical protein TIGR00147 |
| SP1052 | TTTTTTTT | 5prime | phosphoesterase, putative |
| SP1057 | ATATATATAT | 5prime | transcriptional regulator PlcR, putative |
| SP1062 | AAAAAAAA | middle | ABC transporter, ATP-binding protein |
| SP1063 | TATATATATA | middle | ABC-2 transporter, permease protein, putative |
| SP1083 | AGAGAGAGA | 5prime | conserved hypothetical protein |
| SP1083 | GGGGGG | 5prime | conserved hypothetical protein |
| SP1083 | GAGAGAGA | middle | conserved hypothetical protein |
| SP1087 | TGTGTGTG | 5prime | ATP-dependent DNA helicase PcrA |
| SP1090 | AGAGAGAGAG | 5prime | conserved hypothetical protein |
| SP1113 | CAGCAGCAGCAG | 5prime | DNA-binding protein HU |
| SP1130 | TTTTTTTT | 3prime | transcriptional regulator |
| SP1130 | ATATATAT | 3prime | transcriptional regulator |
| SP1151 | GAGAGAGA | middle | exonuclease RexB |
| SP1152 | CTCTCTCT | middle | exonuclease RexA |
| SP1153 | GAGAGAGA | middle | hypothetical protein |
| SP1153 | AAAAAAAA | 3prime | hypothetical protein |
| SP1160 | TTTTTTTT | middle | lipoate-protein ligase, putative |
| SP1166 | AAATAAATAAAT | 5prime | MATE efflux family protein |
| SP1168 | CTCTCTCT | 5prime | mutator MutT protein |
| SP1171 | TGTGTGTG | 3prime | hydrolase, haloacid dehalogenase-like family |
| SP1215 | TTTTTTTT | middle | transporter, FNT family, putative |
| SP1219 | GCGCGCGC | middle | DNA gyrase subunit A |
| SP1219 | ATTCATTCATTC | 5prime | DNA gyrase subunit A |
| SP1222 | TTTTTTTT | middle | type II restriction endonuclease, putative |
| SP1238 | CCCCCC | 5prime | excinuclease ABC, subunit B |
| SP1260 | CCCCCC | 3prime | copper homeostasis protein CutC |
| SP1263 | TTTTTTTT | 5prime | DNA topoisomerase I |
| SP1264 | AGAGAGAG | 5prime | conserved domain protein |
| SP1267 | AGAGAGAG | middle | licC protein |
| SP1267 | ATGATGATGATG | 5prime | licC protein |
| SP1272 | CTCTCTCT | middle | polysaccharide biosynthesis protein, putative |
| SP1272 | CTCTCTCT | 3prime | polysaccharide biosynthesis protein, putative |
| SP1274 | AAAAAAAA | 5prime | licD2 protein |
| SP1283 | CCCCCC | 3prime | heat shock protein HtpX |
| SP1286 | GGGGGGGG | 5prime | uracil permease |
| SP1305 | AAAAAAAA | middle | hypothetical protein |
| SP1311 | GGGGGGG | middle | IS66 family element, Orf3, degenerate |
| SP1311 | TCTCTCTC | 5prime | IS66 family element, Orf3, degenerate |
| SP1316 | GAGAGAGAGA | 3prime | v-type sodium ATP synthase, subunit B |
| SP1321 | TATATATAT | promot | v-type sodium ATP synthase, subunit K |
| SP1326 | TTTTTTTT | 5prime | neuraminidase, putative |
| SP1336 | CTCTCTCT | middle | type II DNA modification methyltransferase Spn5252IP |
| SP1340 | AATAATAATAAT | 5prime | hypothetical protein |
| SP1340 | CCCCCC | 5prime | hypothetical protein |
| SP1341 | TTTTTTTT | middle | ABC transporter, ATP-binding protein |
| SP1342 | TATATATAT | 3prime | toxin secretion ABC transporter, ATP-binding/permease protein |
| SP1344 | TATATATAT | 3prime | conserved hypothetical protein |
| SP1356 | TCCTTCCTTCCT | 3prime | Atz/Trz family protein |
| SP1356 | TCTCTCTC | middle | Atz/Trz family protein |
| SP1358 | CCCCCC | 3prime | ABC transporter, ATP-binding/permease protein |
| SP1361 | TGTGTGTG | 3prime | homoserine dehydrogenase |
| SP1361 | TGATTGATTGATT | 5prime | homoserine dehydrogenase |
| SP1364 | ATTATTATTATTA | 5prime | hypothetical protein |
| SP1368 | TCATCATCATCA | 3prime | psr protein |
| SP1368 | TCTCTCTC | 5prime | psr protein |
| SP1374 | CCCCCC | 5prime | chorismate synthase |
| SP1375 | TTCTTTCTTTCTT | 5prime | 3-dehydroquinate synthase |
| SP1375 | AGAGAGAG | 5prime | 3-dehydroquinate synthase |
| SP1380 | GAGAGAGA | 3prime | hypothetical protein |
| SP1383 | CGCGCGCG | middle | alanyl-tRNA synthetase |
| SP1392 | CCCCCC | middle | alpha-acetolactate decarboxylase |
| SP1393 | AAAAAAAA | promot | conserved hypothetical protein |
| SP1402 | CCCCCC | 5prime | NOL1/NOP2/sun family protein |
| SP1416 | AAAAAAAA | 3prime | S-adenosylmethionine:tRNA ribosyltransferase-isomerase |
| SP1417 | TTTTTTTT | middle | pspC protein, degenerate |
| SP1430 | AGAGAGAG | promot | type II restriction endonuclease, putative, authentic point mutation |
| SP1431 | TATATATA | middle | type II DNA modification methyltransferase, putative |
| SP1431 | AAAAAAAA | 3prime | type II DNA modification methyltransferase, putative |
| SP1441 | GGGGGG | middle | IS66 family element, Orf3, degenerate |
| SP1445 | GCTTGCTTGCTT | middle | GMP synthase |
| SP1450 | CTCTCTCTC | middle | platelet activating factor, putative |
| SP1457 | TCTCTCTCT | 5prime | spoU rRNA methylase family protein |
| SP1458 | CACACACA | middle | thioredoxin reductase |
| SP1465 | AAAAAAAAA | middle | hypothetical protein |
| SP1478 | TCTCTCTC | 3prime | oxidoreductase, aldo/keto reductase family |
| SP1479 | AGAGAGAG | 3prime | peptidoglycan N-acetylglucosamine deacetylase A |
| SP1483 | ACACACAC | 3prime | ATP-dependent RNA helicase, DEAD/DEAH box family |
| SP1492 | CTCTCTCT | 3prime | cell wall surface anchor family protein |
| SP1493 | TCTCTCTC | 5prime | hypothetical protein |
| SP1506 | GGGGGG | 5prime | conserved hypothetical protein |
| SP1506 | CCCCCC | middle | conserved hypothetical protein |
| SP1519 | TCTCTCTC | 3prime | acetyltransferase, GNAT family |
| SP1526 | CCCCCC | 3prime | ABC transporter, ATP-binding protein, authentic frameshift |
| SP1526 | ATATATAT | middle | ABC transporter, ATP-binding protein, authentic frameshift |
| SP1529 | CCCCCC | 3prime | polysaccharide biosynthesis protein, putative |
| SP1547 | AGAGAGAGA | middle | conserved hypothetical protein |
| SP1560 | TTTTTTTT | 5prime | conserved hypothetical protein |
| SP1561 | TTTTTTTT | 3prime | conserved hypothetical protein |
| SP1562 | AAAAAAAA | promot | hypothetical protein |
| SP1563 | AAAAAAAA | promot | pyridine nucleotide-disulphide oxidoreductase family protein |
| SP1573 | TTTTTTTT | middle | lysozyme |
| SP1576 | CCCCCC | middle | homoserine O-succinyltransferase |
| SP1577 | TTTCTTTCTTTC | promot | adenine phosphoribosyltransferase |
| SP1596 | GGGGGG | promot | IS3-Spn1, hypothetical protein, interruption |
| SP1604 | CCCCCCC | middle | hypothetical protein |
| SP1604 | TTTTTTTTT | promot | hypothetical protein |
| SP1605 | TTTTTTTTT | promot | ferredoxin |
| SP1612 | AAAAAAAA | 5prime | conserved domain protein |
| SP1612 | AATAAATAAATA | middle | conserved domain protein |
| SP1617 | TTTTTTTT | middle | PTS system, IIC component |
| SP1617 | CCCCCC | 5prime | PTS system, IIC component |
| SP1619 | CTCTCTCT | middle | PTS system, IIA component |
| SP1621 | TCTCTCTC | middle | transcription antiterminator BglG family protein, authentic frameshift |
| SP1623 | CCCCCC | middle | cation-transporting ATPase, E1-E2 family |
| SP1623 | TTTTTTTT | promot | cation-transporting ATPase, E1-E2 family |
| SP1624 | GGGGGG | middle | acyltransferase family protein |
| SP1625 | AAAAAAAA | promot | cadmium resistance transporter, putative |
| SP1626 | TTTTTTTT | 5prime | ribosomal protein S15 |
| SP1631 | CTCTCTCT | middle | threonyl-tRNA synthetase |
| SP1645 | TTGGTTGGTTGG | 3prime | GTP pyrophosphokinase |
| SP1645 | CCCCCC | 3prime | GTP pyrophosphokinase |
| SP1648 | TTTTTTTTTT | promot | manganese ABC transporter, ATP-binding protein |
| SP1652 | CTCTCTCT | middle | hypothetical protein |
| SP1654 | AACCAACCAACCA | middle | conserved hypothetical protein |
| SP1671 | CTGACTGACTGA | 5prime | D-alanine--D-alanine ligase |
| SP1681 | TTTTTTTT | 5prime | sugar ABC transporter, permease protein |
| SP1686 | CCCCCC | middle | oxidoreductase, Gfo/Idh/MocA family |
| SP1693 | TTTTTTTT | 5prime | neuraminidase A, authentic frameshift |
| SP1697 | CCCCCC | 3prime | ATP-dependent DNA helicase RecG |
| SP1702 | CCCCCCC | 5prime | preprotein translocase, SecA subunit |
| SP1708 | CCCCCC | promot | hypothetical protein |
| SP1709 | ATATATAT | 5prime | phosphoglycerate dehydrogenase-related protein |
| SP1716 | CCCCCC | 3prime | conserved hypothetical protein |
| SP1731 | CCCCCC | middle | conserved hypothetical protein |
| SP1737 | TTCTTCTTCTTCTT | 3prime | DNA-directed RNA polymerase, omega subunit, putative |
| SP1739 | CTCTCTCT | 5prime | KH domain protein |
| SP1747 | CCCCCC | 5prime | conserved hypothetical protein |
| SP1749 | GGGGGG | 5prime | GTP-binding protein |
| SP1751 | TCTCTCTC | middle | magnesium transporter, CorA family, putative |
| SP1761 | CCCCCC | middle | hypothetical protein |
| SP1764 | CCCCCCCC | 5prime | glycosyl transferase, family 2 |
| SP1766 | CCCCCC | 5prime | glycosyl transferase, family 8 |
| SP1768 | AAAAAAAA | middle | conserved hypothetical protein |
| SP1769 | CTCTCTCT | middle | glycosyl transferase, authentic frameshift |
| SP1769 | CCCCCCCCC | 5prime | glycosyl transferase, authentic frameshift |
| SP1772 | (TCAGCGTCGACAAGTGCGTCGGCC)540 | 5prime, middle, 3prime | cell wall surface anchor family protein |
| SP1799 | TTTTTTTT | 5prime | sugar-binding transcriptional regulator, LacI family |
| SP1800 | TCTCTCTC | 5prime | transcriptional activator, putative |
| SP1820 | CACACACA | middle | hypothetical protein |
| SP1821 | TTTTTTTT | 5prime | sugar-binding transcriptional regulator, LacI family |
| SP1823 | CTCTCTCT | 5prime | MgtC/SapB family protein |
| SP1836 | TCTCTCTCT | middle | hypothetical protein |
| SP1844 | AAAAAAAA | middle | hypothetical protein |
| SP1850 | AAAAAAAA | middle | type II restriction endonuclease DpnI |
| SP1852 | CCCCCC | middle | galactose-1-phosphate uridylyltransferase |
| SP1855 | CCCCCC | 5prime | alcohol dehydrogenase, zinc-containing |
| SP1865 | CCCCCC | middle | glutamyl-aminopeptidase |
| SP1871 | GGGGGG | middle | iron-compound ABC transporter, ATP-binding protein |
| SP1872 | CAAGCAAGCAAG | middle | iron-compound ABC transporter, iron-compound-binding protein |
| SP1875 | AGAGAGAG | 5prime | conserved hypothetical protein |
| SP1883 | ACACACAC | 5prime | dextran glucosidase DexS, putative |
| SP1891 | TTTTTTTT | 5prime | oligopeptide ABC transporter, oligopeptide-binding protein AmiA |
| SP1892 | TTTTTTTT | promot | hypothetical protein |
| SP1898 | CCACCACCACCA | 3prime | alpha-galactosidase |
| SP1899 | AAAAAAAA | middle | msm operon regulatory protein |
| SP1914 | TTTTTTTT | middle | hypothetical protein |
| SP1914 | AAAAAAAA | 5prime | hypothetical protein |
| SP1934 | CCCCCC | promot | hypothetical protein |
| SP1945 | TCTCTCTC | promot | hypothetical protein |
| SP1948 | GGGGGG | 3prime | conserved domain protein |
| SP1949 | GGGGGG | promot | hypothetical protein |
| SP1950 | TTTTTTTTT | promot | bacteriocin formation protein, putative |
| SP1951 | GGGGGG | middle | conserved hypothetical protein |
| SP1952 | AAAAAAAA | 5prime | hypothetical protein |
| SP1952 | TTTTTTTT | middle | hypothetical protein |
| SP1954 | AAAAAAAAA | 5prime | serine protease, subtilase family, authentic frameshift |
| SP1955 | ACACACACA | middle | hypothetical protein |
| SP1967 | TTTTTTTT | 5prime | conserved hypothetical protein |
| SP1968 | TTTTTTTT | 3prime | phosphopantetheine adenylyltransferase |
| SP1968 | AAAAAAAA | 5prime | phosphopantetheine adenylyltransferase |
| SP1969 | CTCTCTCTC | 3prime | type II DNA modification methyltransferase, putative |
| SP1971 | AAAAAAAA | middle | hypothetical protein |
| SP1973 | TTTTTTTT | 5prime | spoU rRNA methylase family protein |
| SP1980 | CATCATCATCAT | 3prime | cmp-binding-factor 1 |
| SP1984 | CGCGCGCG | 5prime | conserved hypothetical protein TIGR00157 |
| SP1994 | CCCCCC | 3prime | aminotransferase, class I |
| SP1997 | AAAAAAAA | 5prime | Cof family protein |
| SP2006 | CCCCCC | middle | transcriptional regulator ComX2 |
| SP2017 | TCTCTCTCT | 5prime | membrane protein |
| SP2020 | GAGAGAGA | 5prime | transcriptional regulator, GntR family |
| SP2020 | TAAATAAATAAATA | 5prime | transcriptional regulator, GntR family |
| SP2021 | AAAAAAAA | 5prime | glycosyl hydrolase, family 1 |
| SP2021 | AGAGAGAGA | middle | glycosyl hydrolase, family 1 |
| SP2054 | ATTCATTCATTC | promot | conserved hypothetical protein |
| SP2059 | CCCCCC | 5prime | conserved hypothetical protein |
| SP2064 | TATATATA | middle | hydrolase, haloacid dehalogenase-like family |
| SP2067 | ACACACAC | 3prime | hypothetical protein |
| SP2072 | ATGAATGAATGA | 5prime | glutamine amidotransferase, class-I |
| SP2072 | TTTTTTTT | promot | glutamine amidotransferase, class-I |
| SP2077 | TTTTTTTT | 5prime | transcriptional repressor, putative |
| SP2079 | TTTTTTTT | promot | IS1381, transposase OrfB |
| SP2086 | CTCTCTCT | 5prime | phosphate ABC transporter, permease protein |
| SP2098 | AGAGAGAG | 3prime | membrane protein |
| SP2111 | GGGGGG | 5prime | malA protein |
| SP2114 | CGCGCGCG | middle | aspartyl-tRNA synthetase |
| SP2117 | CCCCCC | promot | hypothetical protein |
| SP2126 | CGTCCGTCCGTC | 3prime | dihydroxy-acid dehydratase |
| SP2133 | AAAAAAAA | 5prime | conserved domain protein |
| SP2136 | TTTTTTTT | 5prime | choline binding protein PcpA |
| SP2136 | TTTTTTTT | 5prime | choline binding protein PcpA |
| SP2145 | GGGGGG | 5prime | antigen, cell wall surface anchor family |
| SP2159 | GTGTGTGT | middle | fucolectin-related protein |
| SP2173 | CCAACCAACCAA | 3prime | dltD protein |
| SP2173 | CCCCCC | 5prime | dltD protein |
| SP2178 | CCCCCC | promot | conserved hypothetical protein, interruption |
| SP2182 | GGGGGG | middle | hypothetical protein |
| SP2190 | TTTTTTTT | middle | choline binding protein A |
| SP2190 | TTTTTTTT | 5prime | choline binding protein A |
| SP2193 | TCTCTCTC | promot | DNA-binding response regulator |
| SP2195 | CTCTCTCT | promot | transcriptional regulator CtsR |
| SP2207 | CTCTCTCT | 3prime | competence protein ComF, putative |
| SP2211 | TCTCTCTC | middle | IS66 family element, Orf3, degenerate |
| SP2216 | CTGCTGCTGCTGC | 3prime | secreted 45 kd protein |
| SP2220 | CCCCCC | 3prime | ABC transporter, ATP-binding protein |
| SP2221 | CTCTCTCT | middle | ABC transporter, ATP-binding protein |
| SP2223 | TTTTTTTT | 5prime | conserved hypothetical protein |
| SP2224 | TCTCTCTC | 5prime | peptidase, M16 family |
| SP2233 | TTTTTTTT | promot | hypothetical protein |
| SP2236 | GAGAGAGA | middle | putative sensor histidine kinase ComD |
| SP2240 | GAGAGAGA | 5prime | spspoJ protein |
| (a) Iterative DNA motifs (k-nucleotide repeats), including homopolymeric tracts, were searched for in the TIGR4 genome sequence using the REPEATS program [G. Benson, M. S. Waterman, Nucleic Acids Res 22, 4828 (1994)]. The minimum length of homopolymeric tracts was set to 8 for A and T, and 6 for G and C; the minimum for di- and trinucleotides was 4 tandem copies, and for tetra-, penta- and hexanucleotides 3 copies. Repeat units of seven nucleotides or longer were not found in 3 or more copies, except for the imperfect repeats in SP1772. The ratio of the observed to the expected frequency of homopolymeric tracts was calculated by Markov chain analysis as described [N. J. Saunders et al., Mol Microbiol 37, 207 (2000)]. This analysis revealed that G or C tracts of 8 bp and A or T tracts of 10 and 11 bp are slightly over-represented. |
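The search thresholds in footnote (a) can be illustrated with a short Python sketch. This is not the REPEATS program and makes no claim to reproduce its algorithm; it is a minimal scan, under the stated thresholds only, for perfect homopolymeric tracts and perfect tandem repeats of 2 to 6 nucleotide units. The function names `find_iterative_dna` and `is_primitive` are hypothetical, and the sketch does not detect imperfect repeats such as those in SP1772 or assign the promoter/5'/middle/3' region labels used in the table.

```python
import re

# Thresholds as stated in footnote (a): homopolymeric tracts of at least
# 8 nt (A, T) or 6 nt (G, C); at least 4 tandem copies of di- and
# trinucleotides; at least 3 copies of tetra-, penta- and hexanucleotides.
HOMOPOLYMER = re.compile(r"A{8,}|T{8,}|G{6,}|C{6,}")
MIN_COPIES = {2: 4, 3: 4, 4: 3, 5: 3, 6: 3}  # unit length -> tandem copies


def is_primitive(unit):
    """True if the unit is not itself a repetition of a shorter unit."""
    return unit not in (unit + unit)[1:-1]


def find_iterative_dna(seq):
    """Return sorted (start, tract) pairs for perfect repeats in seq.

    Hypothetical helper for illustration; start is a 0-based offset.
    """
    seq = seq.upper()
    hits = [(m.start(), m.group()) for m in HOMOPOLYMER.finditer(seq)]
    n = len(seq)
    for k, copies in MIN_COPIES.items():
        i = 0
        while i + k <= n:
            # Extend the run for as long as it stays periodic with period k;
            # this also keeps a partial trailing copy of the unit.
            j = i + k
            while j < n and seq[j] == seq[j - k]:
                j += 1
            if j - i >= k * copies and is_primitive(seq[i:i + k]):
                hits.append((i, seq[i:j]))
                i = j - k + 1  # a new run of period k cannot start earlier
            else:
                i += 1
    return sorted(hits)


print(find_iterative_dna("GTGATGATGATGATC"))  # [(1, 'TGATGATGATGAT')]
```

Because the run is extended one base at a time, a partial trailing copy is kept, which is consistent with table entries such as the 13-nt tract TGATGATGATGAT. Units that are themselves repeats of a shorter unit (for example ATAT) are skipped so that each tract is reported once, under its shortest unit.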
| Supplemental Table 6. Comparative genome hybridizations (a). |
| Gene absent in strain D39 | Gene absent in strain R6 | Description |
| SP0067 | SP0067 | hypothetical protein |
| SP0069 | SP0069 | choline binding protein I |
| SP0071 | SP0071 | immunoglobulin A1 protease |
| | SP0074 | acetyltransferase, CysE/LacA/LpxA/NodL family |
| | SP0163 | transcriptional regulator PlcR, putative |
| SP0165 | SP0165 | flavoprotein |
| SP0166 | SP0166 | pyridoxal-dependent decarboxylase, Orn/Lys/Arg family |
| | SP0167 | hypothetical protein |
| SP0168 | SP0168 | macrolide efflux protein, putative |
| | SP0298 | conserved hypothetical protein |
| | SP0328 | IS1380-Spn1, transposase |
| | SP0343 | IS1380-Spn1, transposase |
| | SP0347 | capsular polysaccharide biosynthesis protein Cps4B |
| | SP0349 | capsular polysaccharide biosynthesis protein Cps4D |
| SP0350 | SP0350 | capsular polysaccharide biosynthesis protein Cps4E |
| SP0351 | SP0351 | capsular polysaccharide biosynthesis protein Cps4F |
| SP0352 | SP0352 | capsular polysaccharide biosynthesis protein Cps4G |
| SP0353 | SP0353 | capsular polysaccharide biosynthesis protein Cps4H |
| SP0354 | SP0354 | hypothetical protein |
| SP0355 | SP0355 | hypothetical protein |
| SP0356 | SP0356 | O-antigen transporter RfbX, putative |
| SP0357 | SP0357 | UDP-N-acetylglucosamine-2-epimerase |
| SP0358 | SP0358 | capsular polysaccharide biosynthesis protein cps4J |
| | SP0379 | conserved hypothetical protein |
| | SP0380 | hypothetical protein |
| | SP0460 | IS1167, transposase |
| SP0461 | SP0461 | transcriptional regulator, putative |
| SP0463 | SP0463 | cell wall surface anchor family protein |
| SP0464 | | cell wall surface anchor family protein |
| SP0466 | SP0466 | sortase, putative |
| SP0467 | SP0467 | sortase, putative |
| SP0468 | SP0468 | sortase, putative |
| | SP0495 | IS1380-Spn1, transposase |
| | SP0539 | bacteriocin BlpM |
| | SP0544 | immunity protein BlpX |
| | SP0666 | conserved hypothetical protein |
| | SP0714 | IS1380-Spn1, transposase |
| | SP0826 | hypothetical protein |
| SP0889 | | hypothetical protein |
| SP0890 | SP0890 | integrase/recombinase, phage integrase family |
| SP0891 | SP0891 | type I restriction-modification system, S subunit, putative |
| SP1055 | SP1055 | Tn5252, Orf 9 protein |
| SP1056 | SP1056 | Tn5252, relaxase |
| SP1057 | SP1057 | transcriptional regulator PlcR, putative |
| SP1059 | SP1059 | hypothetical protein |
| SP1061 | SP1061 | protein kinase, putative |
| SP1062 | ABC transporter, ATP-binding protein |
| SP1063 | SP1063 | ABC-2 transporter, permease protein, putative |
| SP1129 | integrase/recombinase, phage integrase family |
| SP1130 | SP1130 | transcriptional regulator |
| SP1131 | | transcriptional regulator, putative |
| SP1132 | | hypothetical protein |
| SP1134 | SP1134 | hypothetical protein |
| SP1135 | SP1135 | hypothetical protein |
| SP1136 | SP1136 | conserved domain protein |
| SP1137 | SP1137 | GTP-binding protein, putative |
| SP1139 | SP1139 | hypothetical protein |
| SP1141 | hypothetical protein |
| SP1142 | | hypothetical protein |
| SP1143 | conserved hypothetical protein |
| SP1189 | hypothetical protein |
| SP1292 | SAP domain protein |
| SP1315 | | v-type sodium ATP synthase, subunit D |
| SP1316 | SP1316 | v-type sodium ATP synthase, subunit B |
| SP1317 | SP1317 | v-type sodium ATP synthase, subunit A |
| SP1319 | SP1319 | v-type sodium ATP synthase, subunit C |
| SP1320 | v-type sodium ATP synthase, subunit E |
| SP1321 | SP1321 | v-type sodium ATP synthase, subunit K |
| SP1322 | SP1322 | v-type sodium ATP synthase, subunit I |
| SP1324 | SP1324 | ROK family protein |
| SP1325 | SP1325 | oxidoreductase, Gfo/Idh/MocA family |
| SP1326 | SP1326 | neuraminidase, putative |
| SP1327 | | conserved hypothetical protein |
| SP1329 | SP1329 | N-acetylneuraminate lyase |
| SP1330 | SP1330 | N-acetylmannosamine-6-P epimerase, putative |
| SP1331 | SP1331 | phosphosugar-binding transcriptional regulator, putative |
| SP1336 | type II DNA modification methyltransferase Spn5252IP |
| SP1337 | IS1380-Spn1, transposase |
| SP1352 | SP1352 | IS1380-Spn1, transposase |
| SP1439 | IS1380-Spn1, transposase |
| SP1503 | SP1503 | IS1380-Spn1, transposase |
| SP1616 | ribulose-phosphate 3-epimerase family protein |
| SP1617 | PTS system, IIC component |
| SP1618 | PTS system, IIB component |
| SP1619 | PTS system, IIA component |
| SP1620 | PTS system, nitrogen regulatory component IIA, putative |
| SP1621 | transcription antiterminator BglG family protein, authentic frameshift |
| SP1622 | transposase, IS200 family |
| SP1755 | hypothetical protein |
| SP1757 | conserved hypothetical protein |
| SP1758 | SP1758 | glycosyl transferase, group 1 |
| SP1759 | SP1759 | preprotein translocase, SecA subunit |
| SP1760 | SP1760 | conserved domain protein |
| SP1761 | hypothetical protein |
| SP1762 | SP1762 | hypothetical protein |
| SP1763 | SP1763 | preprotein translocase SecY family protein |
| SP1764 | SP1764 | glycosyl transferase, family 2 |
| SP1765 | SP1765 | glycosyl transferase, family 8 |
| SP1766 | SP1766 | glycosyl transferase, family 8 |
| SP1770 | SP1770 | glycosyl transferase, family 8 |
| SP1771 | SP1771 | glycosyl transferase, family 2/glycosyl transferase family 8 |
| SP1772 | SP1772 | cell wall surface anchor family protein |
| SP1793 | hypothetical protein |
| SP1796 | ABC transporter, substrate-binding protein |
| SP1797 | ABC transporter, permease protein |
| (a) This method was used to identify genomic differences between the TIGR4 isolate and strains R6 and D39. All the predicted genes from the TIGR4 isolate were amplified by PCR and arrayed on glass microscope slides as previously described [S. Peterson, R. T. Cline, H. Tettelin, V. Sharov, D. A. Morrison, J Bacteriol 182, 6192 (2000)]. Genomic DNA for comparative genome hybridization studies was labeled according to protocols provided by J. DeRisi (www.microarrays.org/Pdfs/GenomicDNALabel_B.pdf) except that genomic DNA was not digested or sheared prior to labeling. Arrays were scanned using a GenePix 4000B scanner from Axon Inc., and individual hybridization signals were quantitated using TIGR SPOTFINDER [P. Hegde et al., Biotechniques 29, 548 (2000)]. |
| Supplemental Table 7. Regions of atypical nucleotide composition (a). |
| Score: 895 (a), %GC: 49.5 |
| SP0014 | transcriptional regulator ComX1 |
| SP0015 | IS630-Spn1, transposase Orf1 |
| SP0016 | IS630-Spn1, transposase Orf2 |
| Score: 749.6, %GC: 29.5 |
| SP0131 | IS630-Spn1, transposase Orf2, degenerate |
| SP0132 | IS630-Spn1, transposase Orf1, degenerate |
| SP0133 | hypothetical protein |
| SP0134 | hypothetical protein |
| SP0135 | glycosyl transferase, putative |
| SP0136 | glycosyl transferase, family 2 |
| SP0137 | ABC transporter, ATP-binding protein |
| SP0138 | hypothetical protein |
| SP0139 | conserved domain protein |
| SP0140 | UDP-glucose 6-dehydrogenase, authentic frameshift |
| SP0141 | transcriptional regulator |
| Score: 938.4, %GC: 28.1 |
| SP0163 | transcriptional regulator PlcR, putative |
| SP0164 | hypothetical protein |
| SP0165 | flavoprotein |
| SP0166 | pyridoxal-dependent decarboxylase, Orn/Lys/Arg family |
| SP0167 | hypothetical protein |
| SP0168 | macrolide efflux protein, putative |
| SP0169 | lactose phosphotransferase system repressor, degenerate |
| SP0170 | hypothetical protein |
| SP0171 | ROK family protein |
| SP0172 | hypothetical protein |
| SP0173 | DNA mismatch repair protein HexB |
| Score: 645.5, %GC: 43.1 |
| SP0210 | ribosomal protein L4 |
| SP0211 | ribosomal protein L23 |
| SP0212 | ribosomal protein L2 |
| SP0213 | ribosomal protein S19 |
| SP0214 | ribosomal protein L22 |
| SP0215 | ribosomal protein S3 |
| SP0216 | ribosomal protein L16 |
| SP0217 | ribosomal protein L29 |
| SP0218 | ribosomal protein S17 |
| Score: 651, %GC: 29.9 |
| SP0350 | capsular polysaccharide biosynthesis protein Cps4E |
| SP0351 | capsular polysaccharide biosynthesis protein Cps4F |
| SP0352 | capsular polysaccharide biosynthesis protein Cps4G |
| SP0353 | capsular polysaccharide biosynthesis protein Cps4H |
| SP0354 | hypothetical protein |
| Score: 663, %GC: 29.8 |
| SP0568 | valyl-tRNA synthetase |
| SP0569 | type II DNA modification methyltransferase, truncation |
| SP0570 | conserved domain protein |
| SP0571 | cell filamentation protein Fic-related protein |
| Score: 626, %GC: 30.4 |
| SP0575 | helicase, putative |
| SP0576 | transcription antiterminator LicT |
| SP0577 | PTS system, beta-glucosides-specific IIABC components |
| Score: 620.5, %GC: 31.8 |
| SP0664 | zinc metalloprotease ZmpB, putative |
| SP0665 | chorismate binding enzyme |
| Score: 947.7, %GC: 29.1 |
| SP0690 | cell division protein DivIB |
| SP0691 | hypothetical protein |
| SP0692 | hypothetical protein |
| SP0693 | hypothetical protein |
| SP0694 | conserved domain protein |
| SP0695 | HesA/MoeB/ThiF family protein |
| SP0696 | hypothetical protein |
| SP0697 | ABC transporter, ATP-binding protein, authentic point mutation |
| SP0698 | hypothetical protein |
| SP0699 | hypothetical protein |
| SP0700 | transposase, IS30 family, degenerate |
| Score: 744.7, %GC: 29.8 |
| SP1029 | RNA methyltransferase, TrmA family |
| SP1030 | conserved hypothetical protein |
| SP1031 | hypothetical protein |
| SP1032 | iron-compound ABC transporter, iron compound-binding protein |
| SP1033 | iron-compound ABC transporter, permease protein |
| SP1034 | iron-compound ABC transporter, permease protein |
| SP1035 | iron-compound ABC transporter, ATP-binding protein |
| SP1036 | hypothetical protein |
| SP1037 | type II restriction endonuclease, putative |
| SP1038 | hypothetical protein |
| SP1039 | hypothetical protein |
| SP1040 | site-specific recombinase, resolvase family |
| Score: 921.4, %GC: 28.3 |
| SP1056 | Tn5252, relaxase |
| SP1057 | transcriptional regulator PlcR, putative |
| SP1058 | hypothetical protein |
| SP1059 | hypothetical protein |
| SP1060 | hypothetical protein |
| SP1061 | protein kinase, putative |
| SP1062 | ABC transporter, ATP-binding protein |
| SP1063 | ABC-2 transporter, permease protein, putative |
| SP1064 | transposase, IS200 family |
| Score: 647, %GC: 30.4 |
| SP1129 | integrase/recombinase, phage integrase family |
| SP1130 | transcriptional regulator |
| SP1131 | transcriptional regulator, putative |
| SP1132 | hypothetical protein |
| SP1133 | hypothetical protein |
| SP1134 | hypothetical protein |
| Score: 721, %GC: 29.2 |
| SP1317 | v-type sodium ATP synthase, subunit A |
| SP1318 | v-type sodium ATP synthase, subunit G |
| SP1319 | v-type sodium ATP synthase, subunit C |
| SP1320 | v-type sodium ATP synthase, subunit E |
| SP1321 | v-type sodium ATP synthase, subunit K |
| Score: 732.8, %GC: 29.4 |
| SP1337 | IS1380-Spn1, transposase |
| SP1338 | hypothetical protein |
| SP1339 | hypothetical protein |
| SP1340 | hypothetical protein |
| SP1341 | ABC transporter, ATP-binding protein |
| SP1342 | toxin secretion ABC transporter, ATP-binding/permease protein |
| SP1343 | prolyl oligopeptidase family protein |
| Score: 639, %GC: 31.2 |
| SP1422 | hypothetical protein |
| SP1423 | transcriptional repressor, putative |
| SP1424 | hypothetical protein |
| SP1425 | hypothetical protein |
| SP1426 | ABC transporter, ATP-binding protein |
| SP1427 | peptidase, U32 family |
| SP1428 | conserved hypothetical protein |
| SP1429 | peptidase, U32 family |
| SP1430 | type II restriction endonuclease, putative, authentic point mutation |
| SP1431 | type II DNA modification methyltransferase, putative |
| SP1432 | hypothetical protein |
| SP1433 | transcriptional regulator, araC family |
| SP1434 | ABC transporter, ATP-binding/permease protein |
| SP1435 | ABC transporter, ATP-binding protein |
| SP1436 | hypothetical protein |
| SP1437 | conserved domain protein |
| SP1438 | ABC transporter, ATP-binding protein |
| SP1439 | IS1380-Spn1, transposase |
| Score: 3152.6, %GC: 54.7 |
| SP1769 | glycosyl transferase, putative, authentic frameshift |
| SP1770 | glycosyl transferase, family 8 |
| SP1771 | glycosyl transferase, family 2/glycosyl transferase family 8 |
| SP1772 | cell wall surface anchor family protein |
| Score: 614.5, %GC: 29.8 |
| SP1799 | sugar-binding transcriptional regulator, LacI family |
| SP1800 | transcriptional activator, putative |
| SP1801 | conserved hypothetical protein |
| SP1802 | hypothetical protein |
| Score: 697, %GC: 29.4 |
| SP1819 | hypothetical protein |
| SP1820 | hypothetical protein |
| SP1821 | sugar-binding transcriptional regulator, LacI family |
| SP1822 | conserved domain protein |
| SP1823 | MgtC/SapB family protein |
| SP1824 | ABC transporter, permease protein |
| Score: 883.8, %GC: 28.4 |
| SP1828 | UDP-glucose 4-epimerase |
| SP1829 | galactose-1-phosphate uridylyltransferase |
| SP1830 | phosphate transport system regulatory protein PhoU, putative |
| SP1831 | hypothetical protein |
| SP1832 | hypothetical protein |
| SP1833 | cell wall surface anchor family protein |
| Score: 978.4, %GC: 49.8 |
| SP1900 | BirA bifunctional protein |
| SP1901 | RNA methyltransferase, TrmA family |
| Score: 694, %GC: 30.1 |
| SP1946 | transcriptional regulator PlcR, putative |
| SP1947 | hypothetical protein |
| SP1948 | conserved domain protein |
| SP1949 | hypothetical protein |
| SP1950 | bacteriocin formation protein, putative |
| SP1951 | conserved hypothetical protein |
| SP1952 | hypothetical protein |
| SP1953 | toxin secretion ABC transporter, ATP-binding/permease protein |
| SP1954 | serine protease, subtilase family, authentic frameshift |
| SP1955 | hypothetical protein |
| SP1956 | hypothetical protein |
| Score: 657, %GC: 45.5 |
| SP1961 | DNA-directed RNA polymerase, beta subunit |
| Score: 914.1, %GC: 49.5 |
| SP2005 | hypothetical protein |
| SP2006 | transcriptional regulator ComX2 |
| SP2007 | transcription antitermination protein NusG |
| Score: 939.1, %GC: 49.7 |
| SP2067 | hypothetical protein |
| SP2068 | cytidine/deoxycytidylate deaminase family protein |
| SP2069 | glutamyl-tRNA synthetase |
| Score: 700, %GC: 30.3 |
| SP2136 | choline binding protein PcpA |
| SP2137 | IS1381, transposase OrfA, internal deletion |
| (a) Regions of atypical nucleotide composition were identified by χ² analysis: the distribution of all 64 trinucleotides (3mers) was computed for the complete genome in all 6 reading frames, followed by the 3mer distribution in 2000 bp windows. Windows overlapped by 1500 bp. For each window, the χ² statistic on the difference between its 3mer content and that of the whole genome was computed. The most atypical regions, with a score of 600 and above, were considered in this analysis. |
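The windowed χ² analysis in footnote (a) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: "all 6 reading frames" is taken to mean overlapping 3mers counted on both strands, and the score is taken to be the Pearson statistic, the sum over 3mers of (observed - expected)² / expected, where the expected count is the window total scaled by the genome-wide 3mer frequency. The function names are hypothetical, and scores from this sketch need not match the published values.

```python
from collections import Counter
from itertools import product

KMERS = ["".join(p) for p in product("ACGT", repeat=3)]  # all 64 trinucleotides
KMER_SET = set(KMERS)
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def count_3mers(seq):
    """Count overlapping 3mers on both strands (assumed meaning of 'all 6 reading frames')."""
    counts = Counter()
    for strand in (seq, reverse_complement(seq)):
        for i in range(len(strand) - 2):
            kmer = strand[i:i + 3]
            if kmer in KMER_SET:  # ignore 3mers containing ambiguous bases
                counts[kmer] += 1
    return counts


def chi2_windows(genome, window=2000, overlap=1500):
    """Return (window_start, chi2_score) for each window along the genome."""
    genome = genome.upper()
    step = window - overlap  # 2000 bp windows overlapping by 1500 bp advance by 500 bp
    genome_counts = count_3mers(genome)
    genome_total = sum(genome_counts.values())
    results = []
    for start in range(0, len(genome) - window + 1, step):
        win_counts = count_3mers(genome[start:start + window])
        win_total = sum(win_counts.values())
        score = 0.0
        for kmer in KMERS:
            expected = win_total * genome_counts[kmer] / genome_total
            if expected > 0:
                score += (win_counts[kmer] - expected) ** 2 / expected
        results.append((start, score))
    return results
```

Under these assumptions, windows with a score of 600 or above would be kept and overlapping windows merged into regions; how the original analysis merged windows is not stated in the footnote.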
|
|