Science 30 July 1999:
Vol. 285, no. 5428, pp. 751-753
DOI: 10.1126/science.285.5428.751

Reports

Detecting Protein Function and Protein-Protein Interactions from Genome Sequences

Edward M. Marcotte, Matteo Pellegrini, Ho-Leung Ng, Danny W. Rice, Todd O. Yeates, David Eisenberg *

A computational method is proposed for inferring protein interactions from genome sequences on the basis of the observation that some pairs of interacting proteins have homologs in another organism fused into a single protein chain. Searching sequences from many genomes revealed 6809 such putative protein-protein interactions in Escherichia coli and 45,502 in yeast. Many members of these pairs were confirmed as functionally related; computational filtering further enriches for interactions. Some proteins have links to several other proteins; these coupled links appear to represent functional interactions such as complexes or pathways. Experimentally confirmed interacting pairs are documented in a Database of Interacting Proteins.

UCLA-Department of Energy Laboratory of Structural Biology and Molecular Medicine, Departments of Chemistry and Biochemistry and Biological Chemistry, Box 951570, University of California at Los Angeles, Los Angeles, CA 90095-1570, USA.
*   To whom correspondence should be addressed: E-mail: david{at}mbi.ucla.edu
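
The observation behind the method in the abstract (two separate proteins in one organism whose homologs appear fused into a single chain in another organism) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Hit` record layout, the `fusion_links` function, the `max_overlap` parameter, and all protein names below are hypothetical, and the similarity hits are assumed to come from a separate sequence-search step. The sketch also omits the computational filtering the abstract mentions.

```python
from collections import defaultdict
from itertools import combinations
from typing import NamedTuple


class Hit(NamedTuple):
    """One sequence-similarity hit (hypothetical record layout)."""
    query: str    # protein from the genome of interest
    target: str   # protein from another genome
    t_start: int  # first aligned residue on the target (1-based)
    t_end: int    # last aligned residue on the target (inclusive)


def overlap(a: Hit, b: Hit) -> int:
    """Number of target residues covered by both hits."""
    return max(0, min(a.t_end, b.t_end) - max(a.t_start, b.t_start) + 1)


def fusion_links(hits, max_overlap=0):
    """Link two query proteins when both align to the same target protein
    in regions that overlap by at most max_overlap residues.

    Returns {(query_a, query_b): {targets supporting the link}}.
    """
    by_target = defaultdict(list)
    for h in hits:
        by_target[h.target].append(h)

    links = defaultdict(set)
    for target, group in by_target.items():
        for a, b in combinations(group, 2):
            if a.query == b.query:
                continue  # two hits from the same protein are not a pair
            if overlap(a, b) <= max_overlap:
                pair = tuple(sorted((a.query, b.query)))
                links[pair].add(target)
    return dict(links)


# Toy data, invented for illustration only.
hits = [
    Hit("protA", "fusedX", 1, 200),
    Hit("protB", "fusedX", 220, 450),
    Hit("protC", "fusedX", 150, 300),  # overlaps both protA and protB
    Hit("protD", "otherY", 1, 100),    # sole hit on its target
]
links = fusion_links(hits)
print(links)
```

In the toy data only protA and protB cover distinct regions of the same target, so they form the single putative link; protC overlaps both and protD has no partner. A real pipeline would also need to exclude pairs that are simply homologs of each other and to handle promiscuous domains, which is where filtering of the kind the abstract describes would come in.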





Science. ISSN 0036-8075 (print), 1095-9203 (online)