Genetic tests have revealed protein structures (red) with nearly the accuracy of x-ray structures (gray).

IMAGES: M. Stiffler et al./bioRxiv, https://doi.org/10.1101/667790

Mutant genes could supercharge efforts to decipher protein structures

Mapping the atomic structure of proteins is crucial to understanding how they behave, but it’s painstaking work that typically requires dedicated, expensive facilities with supercooled, powerful magnets or stadium-size synchrotrons. Now, two research teams independently report this week that they’ve hit on a way to use genetic and biochemical techniques to do the job, potentially opening structural biology to many more labs. And unlike traditional methods, which visualize proteins in crystals or solution, the approach can also reveal proteins’ natural shape inside cells as they do their work, which could uncover how mutations that disrupt protein function lead to disease.

“It’s fantastic,” says Douglas Fowler, a genome scientist at the University of Washington in Seattle. Fowler notes the new approach doesn’t offer the full atomic map that standard approaches do, but the general shape it provides for a protein nonetheless offers “extremely valuable” clues to its function. Plus, he says, “It could have a really big impact” on efforts to determine structures of proteins that are membrane-bound or part of large complexes, both of which are difficult to study with standard methods.

Today’s most common approach for determining a protein’s structure requires coaxing millions of copies of a protein into an ordered crystal, blasting it with x-rays, and tracking their ricochets to reveal the identity and position of each atom. Alternative approaches, including nuclear magnetic resonance spectroscopy and cryoelectron microscopy, also require large amounts of a protein and can take months. Researchers have unraveled the structures of only about 150,000 of the millions of proteins thought to exist—and the process has taken decades.

Some have tried to speed things up by predicting the most likely shape of a protein simply from its sequence of amino acids and probable interactions between atoms. The accuracy of such computational efforts typically lags behind that of experimental methods. One recent trick to improve matters has been to compare the same protein in multiple species to find pairs of amino acids that have evolved together even though they are far apart on the protein’s linear sequence. That’s a strong indication the two are close together and interact within the folded 3D molecule. But this approach works only if researchers can identify proteins that are shared by many organisms but different enough across them to identify multiple pairs of amino acids evolving in tandem, says Debora Marks, a systems biologist at Harvard Medical School in Boston.

Now, separate teams led by Marks and Ben Lehner, a geneticist at the Barcelona Institute of Science and Technology in Spain, have bypassed the need for evolution’s help. They independently hit on the idea of finding interacting amino acids within a protein by systematically mutating each amino acid and tracking how the changes alter the protein’s function, such as an ability to bind to another molecule. Both groups built on work on a bacterial protein fragment called GB1 by a team led by Ren Sun, a systems biologist at the University of California, Los Angeles. In 2014, Sun’s team reported creating more than half a million copies of the GB1 gene, each with one or two of its 56 amino acids changed. For the so-called single mutants, the researchers systematically swapped every amino acid for one of the 19 other options. In the double mutants, they changed pairs of amino acids, working through nearly all possible combinations. After growing bacteria with these mutant genes and isolating the proteins, Sun’s team determined the importance of GB1’s amino acids by seeing which mutants bound most tightly to its natural target, human immunoglobulin G antibodies.
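
To get a feel for the size of such a mutant library, the counts can be worked out directly. The short sketch below is only back-of-the-envelope arithmetic, not code from any of the teams; the function name `count_mutants` is made up for illustration, and it assumes the standard 20-letter amino acid alphabet and that a double mutant changes two different positions.

```python
from math import comb


def count_mutants(length: int, alphabet: int = 20) -> tuple[int, int]:
    """Return (single mutants, double mutants) for a protein of `length` residues.

    Illustrative helper (hypothetical name). Assumes every position can be
    swapped for any of the other `alphabet - 1` amino acids.
    """
    alternatives = alphabet - 1                     # 19 other options per position
    singles = length * alternatives                 # one position changed
    doubles = comb(length, 2) * alternatives ** 2   # two distinct positions changed
    return singles, doubles


singles, doubles = count_mutants(56)  # GB1 has 56 amino acids
print(singles, doubles)               # 1064 555940
```

For a 56-residue fragment this gives 1064 single mutants and 555,940 possible double mutants, in line with the "more than half a million" variants described for GB1.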

Marks’s and Lehner’s groups realized they could combine the binding data of the single and double mutants to determine which amino acids interact most strongly and are therefore likely to sit next to each other in the protein’s 3D structure. “Sometimes we see mutations that combine to have a much more dramatic effect,” Fowler says. By tracking dozens of such occurrences and feeding the results into a structure prediction program, the teams computed the shape of GB1’s main backbone to within a few angstroms of the already known experimental x-ray structure.
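
The idea of comparing double mutants with their single-mutant parents can be illustrated with a toy calculation. The sketch below is a simplified assumption, not either team's published pipeline: it supposes that mutation effects are measured on a log scale, that independent mutations would simply add up, and that the gap between the observed and expected double-mutant effect flags a likely contact. All names and numbers here are invented for illustration.

```python
def epistasis_scores(single, double):
    """Average absolute deviation from additivity for each pair of positions.

    single: {(position, amino_acid): log_effect}
    double: {((pos_i, aa_i), (pos_j, aa_j)): log_effect}
    Assumes an additive null model: expected double = sum of the two singles.
    """
    deviations = {}
    for (mut_a, mut_b), observed in double.items():
        expected = single[mut_a] + single[mut_b]
        pair = (mut_a[0], mut_b[0])
        deviations.setdefault(pair, []).append(abs(observed - expected))
    return {pair: sum(vals) / len(vals) for pair, vals in deviations.items()}


def top_contacts(scores, n):
    """Return the n position pairs with the largest average deviation."""
    return sorted(scores, key=scores.get, reverse=True)[:n]


# Made-up toy data: log-scale binding effects relative to the unmutated protein.
single = {(3, "A"): -0.5, (3, "G"): -0.25, (40, "L"): -0.5, (10, "K"): -0.25}
double = {
    ((3, "A"), (40, "L")): -3.0,   # far worse than the singles predict
    ((3, "G"), (40, "L")): -2.75,  # same pair of positions, same pattern
    ((3, "A"), (10, "K")): -0.75,  # exactly additive, so no sign of interaction
}

scores = epistasis_scores(single, double)
print(top_contacts(scores, 1))  # [(3, 40)]
```

In this toy set, positions 3 and 40 stand out because their double mutants are much more damaging than the single mutants predict. A ranked list of such pairs is the kind of distance constraint a structure prediction program could take as input.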

The teams, who reported their success yesterday in Nature Genetics, also showed that their technique worked with other small proteins and an RNA with analogous available data. Fowler notes the same approach may be more difficult for proteins with hundreds or thousands of amino acids, because the number of mutant proteins that must be made increases exponentially as the proteins grow. But Marks is optimistic: Early indications suggest the technique can solve structures with only a fraction of all possible mutants. The approach could even work using measures of stability for proteins that lack known binding partners, Lehner’s team says.

The method has other strengths. In March, Lehner and his team posted a preprint on the bioRxiv server describing its use to learn how an RNA-binding protein called TDP-43 may cause the neurodegenerative disease amyotrophic lateral sclerosis (ALS). Insoluble aggregates of TDP-43 have been seen in neurons in the autopsied brains of many ALS patients, but the deposits may not cause the disease. So Lehner and his colleagues made 50,000 mutants of TDP-43 and tracked their toxicity in yeast cells. They found that mutant forms that aggregated were less toxic than other versions of the protein. “This is the exact opposite of what we expected,” Lehner says, cautioning that they need to confirm this result in mammalian cells. Either way, he says, it shows that scanning mutations by the thousands may offer new insights into both proteins themselves and how their structures affect health in living cells.