Science 5 October 2001:
Vol. 294. no. 5540, pp. 93 - 96
DOI: 10.1126/science.1065659

Viewpoint

Protein Structure Prediction and Structural Genomics

David Baker,1 Andrej Sali2

Genome sequencing projects are producing linear amino acid sequences, but full understanding of the biological role of these proteins will require knowledge of their structure and function. Although experimental structure determination methods are providing high-resolution structure information about a subset of the proteins, computational structure prediction methods will provide valuable information for the large fraction of sequences whose structures will not be determined experimentally. The first class of protein structure prediction methods, including threading and comparative modeling, relies on detectable similarity spanning most of the modeled sequence and at least one known structure. The second class of methods, de novo or ab initio methods, predicts the structure from sequence alone, without relying on similarity at the fold level between the modeled sequence and any of the known structures. In this Viewpoint, we begin by describing the essential features of the methods, the accuracy of the models, and their application to the prediction and understanding of protein function, both for single proteins and on the scale of whole genomes. We then discuss the important role that protein structure prediction methods play in the growing worldwide effort in structural genomics.

1 Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA. E-mail: dabaker{at}u.washington.edu.
2 Laboratory of Molecular Biophysics, Pels Family Center for Biochemistry and Structural Biology, The Rockefeller University, New York, NY 10021, USA. E-mail: sali{at}rockefeller.edu.







Science. ISSN 0036-8075 (print), 1095-9203 (online)