This Special Advertising Section is brought to you by AAAS OPMS

Drug Discovery and Biotechnology Trends - Proteomics 3: Probing Proteins' Structures
Knowledge of proteins' structures provides key clues to the ways in which living organisms work. Several evolving tools and technologies give scientists the means to determine those structures.
by Peter Gwynne and Gary Heebner


ADVERTISERS

Affymetrix
DNA microarrays, based on the principles of semiconductor technology
408-731-5000
www.affymetrix.com

Bio-Rad Laboratories
instruments and reagents for life science research including genomics and proteomics
510-741-1000
www.expressionproteomics.com

Ciphergen Biosystems, Inc.
protein microarray systems, and related products that discover, characterize, and assay proteins from native biological samples
510-505-2100
www.ciphergen.com

Genetix [UK]
laboratory automation products and services for microarrays and sequencing
+44 1425 624 600
www.genetix.com

Genetix [USA]
877-436-3849

MWG [Germany]
products and services for genomic research including microarrays, oligonucleotides, sequencing, and laboratory automation
+49 8092 8289-0
www.the-mwg.com

MWG [USA]
336-812-9995

SANYO Sales & Marketing Corporation / SANYO Electric Biomedical Co., Ltd.
constant temperature equipment (incubators, ovens, refrigerators, and freezers)
www.sanyo-biomedical.co.jp

Takara Bio, Inc.
kits and reagents for molecular biology research
+81 77 543 9254
www.takara-bio.co.jp/english

IN THIS ISSUE:
Protein purification
Protein expression
Mutagenesis
Mass spectrometry
X-ray crystallography
Bioinformatics for protein studies
Computer hardware, software, and storage
Automation for drug discovery
The companies in this article were selected at random. Their inclusion in this article does not indicate endorsement by either AAAS or Science, nor is it meant to imply that their products or services are superior to those of other companies.

This is the third of four supplements this year on proteomics. The first two appeared in the 30 April and 4 June issues of Science and the next will appear in the 24 September issue.

"Modern molecular bioscience is built on two pillars: techniques of genetic engineering and protein structure determination," says Sol Gruner, professor of physics at Cornell University. The significance of genetic information in understanding the basis of life is obvious. But trying to make sense of that information without an understanding of the proteins that provide the link between genes and cells is, in Gruner's words, "akin to trying to put together the parts of a car without any instructions."

Irene Gabashvili, technical lead for computational biosciences at Hewlett-Packard, extends that thought. "The genome is not enough," she says. "So we need to know the structure of the proteins that are the participants in all the processes in the cell. We need to know how they look, how they interact, and what capabilities they have. We can't do anything if we don't know the structure."

"Understanding protein structures is the only way we can really understand how things work," agrees Dave Hicks, senior director of the proteomics business group at Applied Biosystems. "Most life science investigators and biologists now need to carry out some type of protein analysis, including many that involve determining some level of protein structures."

Mary Buchanan, marketing director of Stratagene, points to a more practical reason for the effort. "We need to know proteins' structures," she says, "to design drugs against them."

Research teams in universities and pharmaceutical companies have certainly accelerated the pace of their work on the structures of proteins. "The number of depositions in the Protein Data Bank has grown at what appears to be an exponential rate since 1989," Gruner says. "You used to get a Ph.D. for a single protein structure; now you can walk into the lab and leave later in the day with a structure."

Different Beast
Determining structures remains difficult, however. Why? "Proteins are way more complex than nucleic acids," says Mark Roskey, vice president of marketing worldwide at Caliper Life Sciences. "They have different charges, heterogeneities, and polarities, and secondary and tertiary structures." Those structures, which result from proteins' adopting different three-dimensional formats, cause particular complications. "Just because you have the sequence of a protein, you don't necessarily understand how it folds on itself," points out Steve Lee, director of R&D at Hitachi Genetics/MiraiBio. Bruce Jarvis, senior scientist at Epicentre Technologies, highlights an added complication. "Every protein is a different beast," he explains. Victor Fursey, director of marketing and sales for North America at Bruker Daltonics, notes the overall effect of those complications. "You need more tools and a bit more patience than when you study genes," he says.

The most common tools for determining and validating proteins' structures are mass spectrometry, X-ray diffraction, solution nuclear magnetic resonance techniques, and electron microscopy. "Various forms of spectroscopy will also remain important," Hicks says. "And basic biochemical tools - for example, epitope mapping and wet biochemistry - are also required." Agrees Gabashvili: "There's a whole range of biophysical techniques that can give you approximate answers on how proteins interact and are arranged."

Beyond those methods, understanding protein sequences and folding demands powerful computers as well as analytical techniques. "And it always starts with getting the protein purified," Roskey says. To that end, Caliper has recently introduced a microfluidic product, LabChip 90, that replaces the traditional - and laborious - SDS polyacrylamide gel electrophoresis (SDS-PAGE) method of purification. "Rather than training someone to do SDS-PAGE, you can automate your purification process, purify the protein directly into microtiter plates, load, and then walk away to do other things more valuable," Roskey says.

Modes of Expression
Next comes expression. Scientists can express recombinant proteins in several different systems, including bacterial and mammalian cells and even cell-free systems.

Bacterial expression systems are popular because they permit the rapid expression of proteins and are easy to use. However, mammalian cells offer various advantages over bacteria. Proteins from eukaryotic organisms produced in mammalian or other eukaryotic systems are more likely to be functional because the processes of transcription, translation, and posttranslational modification are conserved. Researchers can use plasmids, retroviruses, or adenoviruses to transfer genes into host cells. Cell-free protein expression systems, meanwhile, have their own advantages, such as easy introduction of labels into proteins, the possibility of expressing toxic or apoptotic proteins, and compatibility with studies that require co-expression of proteins.

The choice of expression system frequently involves a compromise. "Bacterial is often a researcher's first choice because it's fast, easy, and cheap," says Buchanan. "But it's so nonmammalian. You have problems of nonsolubility. And if the protein needs some posttranslational processing, you can't get it. In that case you have to move up the ladder, perhaps to an insect before a mammalian system. But with mammalian systems you have very low abundance and it's much more expensive to get a lot of proteins. Cell-free expression systems are more controlled and several get high yields. They are also easier to automate and to use for high throughput. But they are much more expensive than bacterial systems and you don't get posttranslation."

ATCC, Invitrogen, and Qbiogene are among the companies that offer protein expression systems. Stratagene has recently launched a series of bacterial expression vectors with three different tag options. "We have the vectors in all different varieties of tags," Buchanan says. "It gives scientists a chance to stay in bacterial expression systems by increasing the solubility, and gives them options for purification and quantitation as well."

Mutation Methods
Biological research also benefits from technology designed to introduce specific mutations into a DNA sequence. "Mutagenesis helps not so much in determining structure but in verifying it," says Merriann Carey, product manager at Epicentre.

Researchers use site-directed mutagenesis procedures to analyze individual amino acid residues in both protein sequences and specific protein-nucleic acid interactions. Similarly, serial deletion and random insertion protocols can facilitate studies of proteins' structures and analyses of promoters. Finnzymes, GE Healthcare (formerly Amersham Biosciences), New England Biolabs, and Promega offer kits for mutagenesis studies.
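
As a rough illustration of what a site-directed change amounts to at the sequence level, the following Python sketch swaps one codon in a coding sequence and translates the result with the standard genetic code. The function names and the nine-base example fragment are hypothetical and are not drawn from any vendor's kit.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding sequence codon by codon; '*' marks a stop codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def substitute_codon(dna, position, new_codon):
    """Replace the codon at a 1-based residue position with new_codon."""
    new_codon = new_codon.upper()
    if new_codon not in CODON_TABLE:
        raise ValueError("new_codon must be three DNA bases")
    start = (position - 1) * 3
    if position < 1 or start + 3 > len(dna):
        raise ValueError("position lies outside the coding sequence")
    return dna[:start] + new_codon + dna[start + 3:]


wild_type = "ATGGCTAAA"                         # hypothetical fragment: Met-Ala-Lys
mutant = substitute_codon(wild_type, 2, "GAT")  # Ala -> Asp at residue 2
print(translate(wild_type), "->", translate(mutant))  # prints "MAK -> MDK"
```

Comparing the behavior of the wild-type and mutant proteins is what lets a researcher judge whether the altered residue matters for function.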

Epicentre has introduced what it calls its EZ::TN and HyperMu Transposon tools for research that involves mutagenesis. They use DNA sequences called transposons that can hop into other DNA molecules in a reaction catalyzed by a transposase enzyme. "Our EZ::TN Linker Insertion Kit is a transposon-based technology that allows a researcher to insert, at random, 15 codons into the gene of a protein," Carey explains. "Scientists can use it to verify active sites. It's also useful in protein engineering; you can make insertions to see if certain parts of a protein are permissive."

The company will soon introduce a kit that allows scientists to make deletions at both the carboxyl and amino terminal ends of the genes of proteins and then express the truncated proteins. "We think that will be very useful for protein engineering and for making vaccines," Carey says.

Several Flavors
Scientists can use several flavors of mass spectrometry to identify individual proteins. For example, mass spectrometry (MS) combined with matrix-assisted laser desorption ionization-time of flight (MALDI-TOF) provides a common means of analyzing one- and two-dimensional gel profiles. Researchers can compare the data gathered from the digest of a single protein spot with the properties of known proteins in the database and make an exact match or identification for an individual protein. Companies that manufacture these instruments include Applied Biosystems, Bruker Daltonics, Thermo Electron, and Waters.
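
The database comparison described above can be sketched in a few lines of Python. This is a simplified illustration, not any instrument vendor's algorithm: it assumes digestion with trypsin (cleavage after lysine or arginine unless proline follows), uses standard monoisotopic residue masses, works with neutral peptide masses rather than the charged ions a real spectrum reports, and searches a two-entry toy database whose names and sequences are invented.

```python
# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini


def tryptic_digest(sequence):
    """Cleave after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of a peptide."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER


def count_matches(observed, sequence, tolerance=0.2):
    """Count observed masses that fall within tolerance of a predicted peptide."""
    predicted = [peptide_mass(p) for p in tryptic_digest(sequence)]
    return sum(
        any(abs(mass - p) <= tolerance for p in predicted) for mass in observed
    )


def best_match(observed, database, tolerance=0.2):
    """Return the database entry whose predicted digest explains the most peaks."""
    return max(
        database,
        key=lambda name: count_matches(observed, database[name], tolerance),
    )


# Hypothetical two-protein database and a simulated spectrum for protein_A.
DATABASE = {
    "protein_A": "MKWVTFISLLRGSAYK",
    "protein_B": "MDEHLKPQNTRCCVFK",
}
observed_peaks = [peptide_mass(p) for p in tryptic_digest(DATABASE["protein_A"])]
print(best_match(observed_peaks, DATABASE))  # prints "protein_A"
```

Real search software scores matches statistically and allows for missed cleavages and chemical modifications; the simple peak count here only conveys the idea.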

Bruker Daltonics' range also includes quadrupole mass spectrometric technology (QMS) and Fourier transform mass spectrometry (FTMS). "Low abundant proteins are the challenging ones," Fursey says. "We use the quadrupole to enrich them. We also have an FTMS technology with a QQ front end and a MALDI-TOF approach; with the right methodologies you can attach intact proteins." The company has also launched what it calls the Apex-Q Hybrid Qq-FTMS. This features very high magnetic field strengths that allow scientists to probe large protein structures directly using several fragmentation methods.

Applied Biosystems also offers a range of mass spectrometers, including MALDI-TOF, MALDI-TOF-TOF, MALDI-Q-TOF, and high performance MS-MS systems. In addition, the company has partnered with Millipore Corporation to develop its new MALDIspot kit. "It integrates reagents and sample preparation technologies with instrumentation and software and an understanding of how they work," Hicks explains. "The higher throughput configuration helps purify multiple samples directly onto the plate over a range of concentrations. This facilitates the processing of large numbers of samples and enhances the data that scientists obtain to allow for more sensitive analysis."

Focusing on Crystals
Whereas mass spectrometry typically starts with proteins broken down into their peptide components to determine or validate their structures, X-ray crystallography measures structures directly by bombarding single crystals of unknown proteins with X-rays and interpreting the diffraction patterns that result using specialized hardware and software. Crystallography therefore provides reliable answers to many structure-related questions, from protein folds to details of atomic bonding. And the technique has no limitation on the size of the molecule or complex to be studied.
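
The article does not spell out how a diffraction pattern is turned into distances. The standard textbook relation is Bragg's law, n * wavelength = 2 * d * sin(theta), which links the angle of a diffracted beam to the spacing d between planes in the crystal. The Python sketch below applies it; the function name is hypothetical, and the wavelength is that of a copper K-alpha laboratory source rather than a synchrotron beam.

```python
import math


def lattice_spacing(wavelength, theta_degrees, order=1):
    """Plane spacing d from Bragg's law: order * wavelength = 2 * d * sin(theta).

    wavelength and the returned spacing share the same unit (angstroms here);
    theta is the angle between the incident beam and the lattice planes.
    """
    if not 0 < theta_degrees <= 90:
        raise ValueError("theta must lie between 0 and 90 degrees")
    return order * wavelength / (2 * math.sin(math.radians(theta_degrees)))


CU_K_ALPHA = 1.5418  # angstroms, copper K-alpha radiation

# A reflection observed at theta = 30 degrees corresponds to planes
# roughly one wavelength apart, because sin(30 degrees) = 0.5.
print(round(lattice_spacing(CU_K_ALPHA, 30.0), 4))

# The finest spacing any wavelength can resolve is wavelength / 2 (theta = 90).
print(round(lattice_spacing(CU_K_ALPHA, 90.0), 4))
```

Structure determination software works with thousands of such reflections at once, along with their intensities, but each one obeys this same geometric relation.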

Theoretically, scientists can use any source of X-rays to determine proteins' structures. But for the past 15 years they have favored those created by synchrotrons - devices that accelerate electrons in storage rings to produce intense beams of radiation at wavelengths ranging from those of visible light to low-frequency X-rays. Gruner, who is director of the Cornell High Energy Synchrotron Source (CHESS) as well as a physics professor, enumerates the advances that led to synchrotrons' dominance. "First," he says, "we had to have the synchrotron sources. Second was a way of preserving protein crystals so that they would survive the experiment - a cryoloop freezing method in which the protein is frozen to circumvent serious radiation damage issues. And third was the use of charge-coupled device detectors."

CHESS, the only one of the five major American synchrotron sources outside a national laboratory, does more than determine proteins' structures. "We also do a lot of in-house research to improve techniques of protein crystallography," Gruner says. "We're trying to put in place a micrographic capability aimed at membrane proteins, which have crystals unsuited to crystallography; even if they crystallize, they're very small and, because they tend to be unidirectional, difficult to resolve using the usual rotational method."

Since modern synchrotron sources have just about reached the limit of brilliance, physicists are seeking new means of generating synchrotron radiation. Gruner's team has proposed what it calls an energy recovery linac (ERL), a small linear accelerator that could produce radiation three times as brilliant as that from current devices. "The added brilliance would add coherence to the beam that would open up microscopy for seeing inside cells," Gruner says. The group hopes to obtain funding from the National Science Foundation to build a prototype of the device within three to five years.

Capable Software
Whatever technology life scientists use, their studies of the 3-D structures and functions of proteins will generate huge amounts of information. That demands software capable of storing large volumes of data, comparing the data, and recognizing relationships between sequences for the same organism or even for a group of organisms. "Researchers are struggling with the need to access all their information rapidly," comments Lee of Hitachi Genetics/MiraiBio. Crystal Impact, Bio-Rad Laboratories, and other companies offer specialized software packages for data management and analysis of protein structures.

Some researchers work with programmers to develop proprietary software for analyzing their proteomics data, while others have turned to companies that provide tools to make these comparisons. Companies that produce software for protein sequence data analysis include Accelrys, GE Healthcare, and Nonlinear Dynamics. "We have a variety of software tools, mainly based on our Luminex 100 platform," says Allan Minn, senior software engineer at Hitachi Genetics/MiraiBio. "Our MasterPlex QT software can analyze multiplexed data up to 100 analytes or more with virtual plates - something that can't be done by hand." A new product that the company calls DNASIS GeneIndex permits scientists to maintain and query 20 databases uploaded to the server site and analyze all the functional, genomic, proteomics, and structural data for multiple organisms. "It shortcuts efforts to go to database after database," Lee says. "With a single click you'll get all the relational information from multiple databases, bring up the associated references, and weigh the importance of certain keywords."

Informatics can also integrate instruments in laboratories that study proteins. Companies such as Applied Biosystems, Caliper, and Varian have developed not only many of the core instruments and systems used in automated proteomic research but also software tools for analysis. Other companies, including LION Bioscience and TurboWorx, offer solutions to managing and integrating large amounts of scientific data.

Computing Power
Not surprisingly, computing power has become an issue for studies of protein structures. High performance computing is essential for handling the biomathematics, biostatistics, and computational science needed to study complex models of biological processes.

Life science solutions from Hewlett-Packard (HP) include computing, 3-D stereo visualization, and storage hardware, system software, tools, applications, and professional services. Its Alpha, PA-RISC, and Itanium 2-based systems have given the company a powerful presence in scalable 64-bit systems that support very large memory applications. In addition, HP offers IA-32, Intel Xeon, and AMD Opteron systems with 64-bit extensions, thin blade and cluster configurations that scale out very well for structural genomics and proteomics. However, the company emphasizes a focus on individual clients. "The problem in biology is that there is no one superior technique or vendor," Gabashvili says. "What is special about HP solutions is that they are highly customized. We try to determine which high performance architectures are best for the specific problem the customer is working on."

Scientists who work on drug discovery projects that require them to screen thousands of samples of potential drugs to determine their effects on selected targets have one more essential technical requirement: automation. Efforts to automate laboratory procedures have resulted in the development of automated work stations, robotic devices and, most recently, microfluidic devices and automated processing systems. Companies such as Caliper, Cepheid, Ciphergen, and Zyomyx have taken different approaches to automating various steps in protein analysis using microfluidics.

Caliper has focused on manipulating very small protein samples. Using the company's LabChips and accompanying instruments, researchers can downsize the volumes of samples and reagents, consolidate procedures into a single small chip, and speed up the processing of their many protein samples. The LabChip 3000 drug discovery system performs screening experiments in a serial, continuous flow fashion. "We put a lot of industrial design and usability features into the LabChip 3000," says Yvonne Linney, Caliper's director of marketing. "It is much more user friendly and reliable and it has a smaller footprint." Pharmaceutical companies such as Amgen, Eli Lilly and Company, and Millennium Pharmaceuticals are using Caliper's systems in their drug screening programs.

Further along the drug pipeline, mass spectrometric technology developed by Bruker Daltonics gives pharmas the opportunity to check for modifications in drugs. "We can see quality assurance applications to check drugs and their modifications before they are shipped out," Fursey says.

Vendors continue to modify current tools and technologies and to develop new ones to help advance scientists' understanding of protein structures and function. As the tools emerge, they will have significant impact on basic research, clinical diagnostics, and drug discovery.

Peter Gwynne (pgwynne767{at}aol.com) is a freelance science writer based on Cape Cod, Massachusetts, U.S.A. Gary Heebner (gheebner{at}cell-associates.com) is a marketing consultant with Cell Associates in St. Louis, Missouri, U.S.A.
WEBLINKS

FEATURED COMPANIES

Accelrys - a subsidiary of Pharmacopeia
bioinformatics software
www.accelrys.com

American Type Culture Collection (ATCC)
protein expression and purification systems
www.atcc.org

Amgen, Inc.
biotechnology organization
www.amgen.com

Applied Biosystems
instruments and reagents for proteomics research
www.appliedbiosystems.com

Bio-Rad Laboratories
bioinformatics software
www.expressionproteomics.com

Bruker Daltonics, Inc.
mass spectrometry systems
www.bdal.com

Caliper Life Sciences
microfluidic devices
www.calipertech.com

Cepheid
microfluidic devices
www.cepheid.com

Ciphergen Biosystems, Inc.
instruments and arrays for proteomics research
www.ciphergen.com

Cornell University
university
www.cornell.edu

Crystal Impact
bioinformatics software
www.crystalimpact.com

Eli Lilly and Company
pharmaceutical organization
www.lilly.com

Epicentre Technologies
mutagenesis kits and reagents
www.epicentre.com

Finnzymes
mutagenesis kits and reagents
www.finnzymes.com

GE Healthcare (formerly Amersham Biosciences)
mutagenesis kits and reagents
www.amershambiosciences.com

Hewlett-Packard
computers and operating systems
www.hp.com

Hitachi Genetic Systems/MiraiBio
instruments and supplies for array fabrication
www.miraibio.com

Invitrogen Corporation
protein expression and purification systems
www.invitrogen.com

LION Bioscience AG [Germany]
software solutions for proteomics
www.lionbioscience.com

Millennium Pharmaceuticals, Inc.
biotechnology organization
www.mlnm.com

Millipore Corporation
MS sample preparation
www.millipore.com

National Science Foundation
government agency which promotes the progress of science
www.nsf.gov

New England Biolabs, Inc.
mutagenesis kits and reagents
www.neb.com

Nonlinear Dynamics, Ltd. [UK]
bioinformatics software
www.nonlinear.com

Promega Corporation
mutagenesis kits and reagents
www.promega.com

Qbiogene, Inc.
protein expression and purification systems
www.qbiogene.com

Stratagene
protein purification kits and reagents
www.stratagene.com

Thermo Electron Corporation
mass spectrometry systems
www.thermo.com

TurboWorx, Inc.
software solutions for proteomics
www.turboworx.com

Varian, Inc.
bioinformatics software
www.varianinc.com

Waters Corporation
mass spectrometry systems
www.waters.com

Zyomyx, Inc.
microfluidic devices
www.zyomyx.com

Note: Readers can find out more about the companies and organizations listed by accessing their sites on the World Wide Web (WWW). If the listed organization does not have a site on the WWW or if it is under construction, we have substituted its main telephone number. Every effort has been made to ensure the accuracy of this information. The companies and organizations in this article were selected at random. Their inclusion in this article does not indicate endorsement by either AAAS or Science nor is it meant to imply that their products or services are superior to those of other companies.

This article was published as a special advertising section in the 30 July 2004 issue of Science.