Science, 8 October 1993: Vol. 262, no. 5131, pp. 208-214. DOI: 10.1126/science.8211139
Copyright © 1993 by American Association for the Advancement of Science
Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment
CE Lawrence, SF Altschul, MS Boguski, JS Liu, AF Neuwald, and JC Wootton
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894.
A wealth of protein and DNA sequence data is being generated by genome projects and other sequencing efforts. A crucial barrier to deciphering these sequences and understanding the relations among them is the difficulty of detecting subtle local residue patterns common to multiple sequences. Such patterns frequently reflect similar molecular structures and biological properties. A mathematical definition of this "local multiple alignment" problem suitable for full computer automation has been used to develop a new and sensitive algorithm, based on the statistical method of iterative sampling. This algorithm finds an optimized local alignment model for N sequences in N-linear time, requiring only seconds on current workstations, and allows the simultaneous detection and optimization of multiple patterns and pattern repeats. The method is illustrated as applied to helix-turn-helix proteins, lipocalins, and prenyltransferases.
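The iterative sampling idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a simplified site sampler that assumes exactly one pattern occurrence of a fixed, known width per sequence, a DNA alphabet, and a single shared background distribution. The function name `gibbs_site_sampler` and its parameters (`width`, `sweeps`, `pseudo`, `seed`) are hypothetical names chosen for illustration, and the sketch leaves out multiple patterns, pattern repeats, and any refinement steps the paper may use.

```python
import math
import random


def gibbs_site_sampler(seqs, width, sweeps=100, pseudo=0.5, seed=0, alphabet="ACGT"):
    """Sample one motif start per sequence (illustrative sketch only).

    Assumes every sequence uses only letters from `alphabet` and is at
    least `width` long. Returns a list of 0-based start positions.
    """
    rng = random.Random(seed)
    n_letters = len(alphabet)

    # Background letter frequencies, estimated from all sequences (a simplification).
    total = sum(len(s) for s in seqs)
    background = {
        a: (sum(s.count(a) for s in seqs) + pseudo) / (total + pseudo * n_letters)
        for a in alphabet
    }

    # Random initial alignment: one start position per sequence.
    starts = [rng.randrange(len(s) - width + 1) for s in seqs]

    # Per-column letter counts for the current alignment.
    counts = [dict.fromkeys(alphabet, 0) for _ in range(width)]
    for s, p in zip(seqs, starts):
        for k in range(width):
            counts[k][s[p + k]] += 1

    # Normaliser for a pattern model built from the other N-1 sequences.
    denom = len(seqs) - 1 + pseudo * n_letters

    for _ in range(sweeps):
        for i, s in enumerate(seqs):
            # Predictive update step: remove sequence i from the pattern model.
            for k in range(width):
                counts[k][s[starts[i] + k]] -= 1

            # Sampling step: weight each candidate start by its
            # pattern-to-background likelihood ratio.
            weights = []
            for p in range(len(s) - width + 1):
                log_ratio = sum(
                    math.log((counts[k][s[p + k]] + pseudo) / denom / background[s[p + k]])
                    for k in range(width)
                )
                weights.append(math.exp(log_ratio))
            starts[i] = rng.choices(range(len(weights)), weights=weights)[0]

            # Put sequence i back into the model at its newly sampled position.
            for k in range(width):
                counts[k][s[starts[i] + k]] += 1

    return starts
```

Because the column counts are updated incrementally, one sweep over the data costs time proportional to the total sequence length times the pattern width, which is in keeping with the N-linear scaling the abstract reports. The sampler is stochastic, so in practice it would likely be run from several random seeds and the alignments compared.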