Between alternative transcription start sites, alternative splicing, and posttranslational modifications, a given gene may produce dozens of protein variants, each with a different biological activity. Teasing apart those structure-function relationships requires mapping specific variants to their associated biological functions, and the tool of the trade for doing so is mass spectrometry. But not just any mass spec will do. Researchers need a holistic view of protein structure, data that is lost with the popular “bottom-up” proteomics strategy. Powered by today’s ultrahigh-resolution, high mass-accuracy mass specs, protein biochemists are increasingly turning bottom-up upside-down. Their new alternative: top-down proteomics.
If you want to know which of a gene’s many variants, or “proteoforms,” is responsible for a particular biological activity, you need a way to detect that proteoform directly. That’s easier said than done.
Proteoform analysis is fundamentally a two-part problem. The first part, protein identification, is a simple question of peptide sequencing: matching spectral peaks to a protein’s amino acid sequence and thence to the gene that encoded it. This can be complicated if related proteins are present in a sample, because they share identical stretches of amino acid sequence, but in general it is relatively straightforward.
Tougher by far is characterization. A given protein may exist in dozens of forms distinguished by just a few daltons, variants that differ in terms of messenger RNA (mRNA) splicing, posttranslational proteolytic processing, and chemical modification. Take histones, for instance. Histone proteins can be heavily modified by methyl, acetyl, and phosphoryl groups, among others, at their N-termini, which in turn can impact chromatin structure and gene expression. In 2009, University of Pennsylvania Presidential Associate Professor Benjamin Garcia (then at Princeton University) used a so-called middle-down strategy—in which relatively large protein fragments (bigger than tryptic peptides but smaller than intact proteins) are analyzed and sequenced in the mass spectrometer—and some clever chromatography to resolve and identify 70 proteoforms of human histone H4 and 200 of human histone H3.2.
It isn’t clear that every one of those variants has a different biological activity, of course. But the only way to know is to accurately tally them and track their changes under different biological conditions. And therein lies the rub. In bottom-up proteomics, researchers digest their proteoforms to peptides, typically with the protease trypsin, separate the peptides via liquid chromatography, and deliver them to the mass spectrometer. But as it cleaves the peptide backbone, trypsin also destroys any chance researchers have of understanding how posttranslational modifications are linked. The enzyme can cleave the 15 kilodalton (kDa) histone H3.2 29 times, including at more than a dozen sites in the critical N-terminal tail. Using a bottom-up strategy effectively destroys information on how those individual chemical modifications are related, meaning researchers may be able to see that given modifications are present, but are largely blind to their interplay and stoichiometry. They certainly wouldn’t be able to determine if, say, two modifications were coincident or mutually exclusive.
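The loss of linkage information can be illustrated with a toy calculation. The sketch below is hypothetical: the two site labels and the 50/50 mixtures are invented for illustration, and it assumes each modified site ends up on its own peptide after digestion. Under those assumptions, a sample in which two modifications always occur together and a sample in which they never do look identical at the peptide level, and differ only when intact molecules are counted.

```python
from collections import Counter

# Hypothetical mixtures of 100 molecules each. Every molecule is described by
# the set of modified sites it carries; the site labels are illustrative only.
mixture_coincident = [frozenset({"K4me3", "K9ac"})] * 50 + [frozenset()] * 50
mixture_exclusive = [frozenset({"K4me3"})] * 50 + [frozenset({"K9ac"})] * 50


def bottom_up_view(mixture):
    """Per-site occupancy, assuming each site lands on a separate peptide."""
    counts = Counter(site for proteoform in mixture for site in proteoform)
    return {site: counts[site] / len(mixture) for site in sorted(counts)}


def top_down_view(mixture):
    """Relative abundance of each intact combination of modifications."""
    counts = Counter(mixture)
    return {tuple(sorted(p)): n / len(mixture) for p, n in counts.items()}


if __name__ == "__main__":
    print(bottom_up_view(mixture_coincident))
    print(bottom_up_view(mixture_exclusive))
    print(top_down_view(mixture_coincident))
    print(top_down_view(mixture_exclusive))
```

Both mixtures report 50% occupancy at each site in the peptide-level view, whereas the intact view separates "always together" from "never together."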
In the top-down approach, the histone proteoforms are delivered to the mass spec intact and then sequenced by fragmentation inside the instrument, thereby retaining the critical linkage data. This is a more technically challenging strategy, in that intact proteins are harder to fractionate and fragment than peptides, and much harder to separate by liquid chromatography. Furthermore, it takes relatively high-end instrumentation to resolve such large molecules when they are so similar in size, and special software to do the analysis. Lysine trimethylation, for instance, increases protein mass by 42.0470 Da, while acetylation adds 42.0106 Da, a difference of just 0.0364 Da.
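To put that mass difference in perspective, a rough estimate of the resolving power needed to tell the two modifications apart is simply the mass divided by the mass difference. The sketch below uses the shifts quoted above and the roughly 15 kDa histone mass mentioned earlier; the 1.5 kDa peptide is a hypothetical comparison point, and the calculation ignores charge states and isotopic envelopes, which complicate real spectra.

```python
# Mass shifts quoted in the text, in daltons.
TRIMETHYL_SHIFT_DA = 42.0470
ACETYL_SHIFT_DA = 42.0106


def required_resolving_power(mass_da, delta_da):
    """Rough m / delta-m needed to separate two species delta_da apart."""
    return mass_da / delta_da


delta = TRIMETHYL_SHIFT_DA - ACETYL_SHIFT_DA  # about 0.0364 Da

# Roughly 15 kDa intact histone versus a hypothetical 1.5 kDa peptide.
intact_protein = required_resolving_power(15_000, delta)
small_peptide = required_resolving_power(1_500, delta)

if __name__ == "__main__":
    print(f"mass difference: {delta:.4f} Da")
    print(f"intact 15 kDa protein: about {intact_protein:,.0f}")
    print(f"1.5 kDa peptide: about {small_peptide:,.0f}")
```

On this simple estimate the intact protein demands about 10 times the resolving power of the peptide, which is one way to see why top-down work leans on high-end instruments.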
Still, top-down is on the upswing, says Neil Kelleher, the Glass Professor of Life Sciences at Northwestern University, founding member of the Consortium for Top Down Proteomics and a top-down evangelist. At the recent annual meeting of the American Society for Mass Spectrometry (ASMS), for example, top-down accounted for “10% to 15%” of the conference, Kelleher says. “A decade ago, it was around 0.1%, or very fringe.”
Top of the line
One driver for the growth in top-down is the increasing availability of instrumentation capable of running the experiments. Given the need to distinguish proteins varying by only small chemical changes, top-down researchers typically use high-end, high-resolution instrumentation. Just a few years ago, that mostly meant top-of-the-line Fourier-transform ion-cyclotron resonance (FT-ICR) mass spectrometers, massive and complicated hardware offering resolution values, and pricing, in the millions. Today, more affordable quadrupole-time-of-flight (qTOF) instruments, such as the Waters SYNAPT G2-Si and the Bruker maXis II, as well as the Thermo Fisher Scientific benchtop Orbitrap mass spectrometers, have made the technology more accessible.
Still, for some jobs, only an FT-ICR will do. And one of the world’s most powerful such systems just went online at the Pacific Northwest National Laboratory (PNNL), in Washington state.
FT-ICR mass spectrometers derive their exquisite resolution from the massive cryo-cooled magnets that power them, and as magnetic field strength rises, so too does performance, says Ljiljana Paša-Tolić, lead scientist for mass spectrometry at PNNL’s Environmental Molecular Sciences Laboratory user facility. Thus, with a more powerful magnet, “you can think about getting higher resolving power in the same acquisition time, or you can get equal resolving power in a shorter acquisition time.”
The PNNL already has several FT-ICR instruments, including systems with 12 and 15 tesla (T) magnets. The new instrument, which went online in mid-March, clocks in at 21 T. With a linear ion trap (Thermo Scientific LTQ-Velos) on the front end and an Agilent Technologies magnet on the back, the instrument “occupies almost the whole room; it [weighs] about 24 tons,” Paša-Tolić says. The magnet itself requires about 4,000 L of liquid helium to maintain its working temperature of 2.19 kelvin. (A second 21 T instrument, employing a Bruker magnet, has been installed at the National High Magnetic Field Laboratory in Tallahassee, Florida.)
Preliminary data from the PNNL instrument were presented at the recent ASMS conference, Paša-Tolić says. “We were able to demonstrate resolving power of about 8 million for 12-second transients, which is great,” she says. That 12-second analysis time is too slow for the traditional LC-MS workflow, in which proteins flow straight from a liquid chromatography (LC) column into the mass spectrometer (MS), she notes. But even at a more LC-compatible rate, the instrument yields resolutions of about 1 million, she says, and further improvements are in the works. “We have demonstrated a resolving power … an order of magnitude greater than what is attainable with currently available commercial technology.”
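The trade-off Paša-Tolić describes between field strength and acquisition time can be sketched with textbook relations. An ion’s cyclotron frequency is f = qB/(2πm), and the sketch below assumes that resolving power scales with the product of magnetic field and transient length, an idealization that ignores collisional damping and other instrument effects. The function names and the m/z 400 example ion are illustrative choices, not details from PNNL.

```python
import math

ELEMENTARY_CHARGE_C = 1.602176634e-19
DALTON_KG = 1.66053906660e-27


def cyclotron_frequency_hz(field_tesla, mz):
    """Unperturbed cyclotron frequency f = qB / (2*pi*m) for an ion of given m/z."""
    return ELEMENTARY_CHARGE_C * field_tesla / (2 * math.pi * mz * DALTON_KG)


def relative_resolving_power(field_tesla, transient_s, ref_field_tesla, ref_transient_s):
    """Resolving power relative to a reference, assuming it scales as B * T."""
    return (field_tesla * transient_s) / (ref_field_tesla * ref_transient_s)


if __name__ == "__main__":
    # Hypothetical ion at m/z 400 in the 21 T magnet.
    print(f"{cyclotron_frequency_hz(21, 400) / 1e3:.0f} kHz")
    # Same 12-second transient, 21 T versus 15 T.
    print(relative_resolving_power(21, 12, 15, 12))
    # Transient a 15 T magnet would need to match 21 T at 12 seconds.
    print(12 * 21 / 15)
```

Under that assumed scaling, moving from 15 T to 21 T buys about 40% more resolving power for the same transient, or the same resolving power in about 70% of the time.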
Among other things, Paša-Tolić hopes to use the 21 T to break the size barrier that bedevils top-down research. Top-down researchers typically struggle to characterize proteins larger than about 50 kDa, though some have used the technique to tackle the posttranslational modifications of 150 kDa biotherapeutic antibodies. But with a more powerful magnet, it may be possible to routinely hit 100 kDa or more, Paša-Tolić says. Indeed, her lab already presented data at ASMS demonstrating “isotopically resolved” analysis of 70 kDa proteins (such as intact bovine serum albumin) at high spectral-acquisition rates.
Paša-Tolić now plans to direct the instrument at secreted fungal enzymes, especially those that degrade lignocellulose. These heavily glycosylated proteins, weighing between 50,000 and 100,000 Da, could advance biofuel development, and Paša-Tolić is developing new reverse-phase chromatography strategies to separate them. “It would be very beneficial to figure out how this pattern of glycosylation relates to function and stability and eventually glycoengineer these enzymes to be more stable and more commercially affordable,” she says.
Top-down proteomics is so named because intact proteins are separated and broken down into smaller and smaller pieces in the MS to determine their sequence and modifications. To do that, researchers can apply any of a number of protein fragmentation methods, and the more options available, the better. “You might want to have a lot of fragmentation tools available to really get the most out of the actual experiment,” says Andreas Huhmer, proteomics marketing director at Thermo Fisher Scientific. Thermo’s new Orbitrap Fusion Lumos, for instance, offers collision-induced dissociation (CID), in which the peptide backbone is broken by collision with a gas molecule, and the related higher-energy collisional dissociation (HCD). It also enables the popular electron-transfer dissociation (ETD), which uses a charged donor molecule to induce fragmentation, as well as hybrid methods, such as electron-transfer and higher-energy collision dissociation (EThcD).
Jennifer Brodbelt, the William H. Wade Endowed Professor of Chemistry at the University of Texas at Austin, is developing an alternative fragmentation approach. Ultraviolet photodissociation (UVPD) uses ultraviolet laser pulses to cause proteins to shatter along their backbone, producing a ladder of fragments that vary in size by a single amino acid. That’s how other fragmentation methods are supposed to work, too, but according to Brodbelt, most tend to fragment more efficiently at protein termini or near charged residues, providing incomplete sequence coverage. UVPD seems to provide relatively uniform coverage across the entire sequence, at least for proteins up to 40 kDa, including the oft-overlooked protein center. “The fragmentation process does not seem to be as charge-modulated as those other methods,” she says.
Brodbelt has worked with Thermo Fisher Scientific to implement the technique on Orbitrap instruments. In one recent paper, she applied the method on an Orbitrap Elite to map the linkages in branched poly-ubiquitin chains. The result was a remarkable series of fragment ions, one for each consecutive amino acid of the ubiquitin chain, terminating at the residue to which the ubiquitin moiety is coupled. By simply counting those ions and watching where they abruptly disappeared, she could determine where the inter-ubiquitin linkages must have occurred.
“You’d see a huge shift, a mass shift when ubiquitin appeared at a particular lysine,” Brodbelt explains.
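The logic of reading a modification site off a fragment ladder can be sketched in a few lines. This is a simplified illustration, not Brodbelt’s analysis: the nine-residue sequence is invented, the ladder is built from plain sums of monoisotopic residue masses without ion-type offsets or charge states, and the shift used is the 42.0106 Da acetyl mass quoted earlier rather than the much larger mass of an attached ubiquitin.

```python
# Monoisotopic residue masses (Da) for the residues in the toy sequence.
RESIDUE_MASS_DA = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "V": 99.06841,
    "T": 101.04768, "L": 113.08406, "K": 128.09496, "E": 129.04259,
}


def prefix_ladder(sequence, shift_at=None, shift_da=0.0):
    """Cumulative masses of N-terminal fragments, one per residue.

    If shift_at is given, shift_da is added at that zero-based position, so
    every fragment that contains the residue carries the extra mass.
    """
    ladder, total = [], 0.0
    for index, residue in enumerate(sequence):
        total += RESIDUE_MASS_DA[residue]
        if index == shift_at:
            total += shift_da
        ladder.append(total)
    return ladder


def locate_shift(sequence, observed, tolerance_da=0.01):
    """Zero-based index of the first fragment that deviates from the
    unmodified ladder, or None if every fragment matches."""
    expected = prefix_ladder(sequence)
    for index, (exp, obs) in enumerate(zip(expected, observed)):
        if abs(obs - exp) > tolerance_da:
            return index
    return None


if __name__ == "__main__":
    toy_sequence = "GASKLETKV"  # hypothetical peptide
    observed = prefix_ladder(toy_sequence, shift_at=3, shift_da=42.0106)
    print(locate_shift(toy_sequence, observed))
```

Fragments shorter than the modified residue match the unmodified ladder, and every fragment from that residue onward is offset, so the first mismatch marks the site.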
Though still in development, UVPD systems have been installed in several labs. The PNNL 21 T has one. So does John Yates III, the Ernest W. Hahn Professor at the Scripps Research Institute, who has mounted the system on an Orbitrap Fusion. Bottom-up proteomics, Yates explains, has long been considered easier than top-down in part because the infrastructure required to do it—the mass spectrometers, the peptide separation methods, and the analytical software—was already mature when the technique was developed. The experiments themselves were thus easier to perform. “For top-down, almost everything has to be reinvented or certainly significantly improved in order to make this whole workflow possible.” That, he says, explains his enthusiasm for UVPD. “Hopefully it will get us the kind of fragmentation that we need in order to effectively analyze these things.”
From top-down to top-top-down
As top-down adoption grows, so too do the technical developments. One emerging area is what Kelleher calls “top-top-down,” or native mass spectrometry. The method allows researchers to examine multiprotein complexes in the MS, and one researcher making significant headway on this approach is Vicki Wysocki, Ohio Eminent Scholar at Ohio State University.
Existing fragmentation approaches, such as CID, simply cannot inject enough energy per collision into a protein complex to cause it to fall apart, Wysocki explains. “If you have a very large protein complex …[and] if you are colliding that into argon with a mass of 40, the amount of energy that you can transfer will be fairly small.” Rather than falling apart, a protein in such a complex will simply unfold, she says. So, her group developed an alternative approach, surface-induced dissociation (SID), in which complexes are smashed at high speed into an inert fluorocarbon-coated gold surface.
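Wysocki’s point about argon follows from the standard two-body collision relation: when an ion strikes a stationary target, only the fraction m_target / (m_target + m_ion) of its kinetic energy is available in the center-of-mass frame for conversion to internal energy. The sketch below applies that relation with the argon mass from the quote; the 1.5 kDa peptide and 800 kDa complex are hypothetical examples, and a real collision cell involves many collisions rather than one.

```python
ARGON_MASS_DA = 40.0  # value used in the quote above


def center_of_mass_fraction(ion_mass_da, target_mass_da=ARGON_MASS_DA):
    """Fraction of lab-frame kinetic energy available in the center-of-mass
    frame for a single collision with a stationary target."""
    return target_mass_da / (target_mass_da + ion_mass_da)


if __name__ == "__main__":
    # Hypothetical 1.5 kDa peptide versus a hypothetical 800 kDa complex.
    for label, mass in [("1.5 kDa peptide", 1_500.0), ("800 kDa complex", 800_000.0)]:
        print(f"{label}: {center_of_mass_fraction(mass):.2e}")
```

The available fraction falls from a few percent for the peptide to a few thousandths of a percent for the complex. A surface behaves as a far more massive collision partner than a single gas atom, which is the usual rationale for SID.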
Using SID, Wysocki says, researchers can work out the topology of protein complexes and subcomplexes, teasing them apart to determine, for instance, which protein-protein interfaces are strong and which are weak. Suppose a given complex is a hexamer, she explains, arranged as a dimer of trimers. “We will directly see those trimers as products of the SID,” she says. In one recent example, Wysocki’s team used that approach to work out the stoichiometry of the Pyrococcus furiosus RNase P complex, an RNA-containing tetramer whose structure was previously unresolved.
Waters has been working with Wysocki’s group to offer SID capability to selected investigators on the SYNAPT G2 series qTOFs, and Wysocki has received grant funding to implement the method on Orbitrap and FT-ICR instruments as well. She has also developed more elaborate implementations, including a modified qTOF containing two SID cells flanking Waters’ ion mobility separation unit, for performing multiple surface collision events. Ion mobility separation, Wysocki explains, “is sort of like a gas-phase electrophoresis,” separating ions by size and shape, and it “has been a huge help in all of this work.”
Another emerging development is top-down-based mass spectrometric imaging, Paša-Tolić says. Richard Caprioli at Vanderbilt University and Ron Heeren in the Netherlands have both demonstrated laser ablation-based top-down strategies in the past year using FT-ICR mass analyzers, and Paša-Tolić says she would like to apply such strategies to study the soil rhizosphere, for instance, to determine where different secreted enzymes are located. “If you think about the way we do top-down proteomics right now, it clearly is missing spatial information,” Paša-Tolić says. “In many instances, this would be extremely useful to have.”
As for Kelleher, he sees a bright future for top-down in clinical research. Indeed, it is in the clinic that one of top-down’s biggest successes can already be found. The Bruker BioTyper, a simple matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometer for identification of bacterial pathogens based on intact protein masses, “has been a smash success …. Arguably one of the best successes of proteomics in clinical medicine,” Kelleher says. Now he hopes to apply that same top-down philosophy to clinical biomarker development for complex disease.
With a growing user community, he won’t be alone in that work. But Kelleher remains undaunted. “I’m smiling,” he says. “Even if people are telling me they’ve done it better than my group has, I just say, ‘okay, great, it’s a big sandbox. Come play!’”
Inclusion of companies in this article does not indicate endorsement by either AAAS or Science, nor is it meant to imply that their products or services are superior to those of other companies.