The portion of the sugar chain on the right fits well with the experimental data (blue mesh) and is in the expected conformation, whereas the one on the left shows a distorted conformation and poor fit to the data.

Jon Agirre

Misleading sugar structures produce bitter result for protein sleuths

Structure determines function. It’s the canon among biologists who seek to understand a protein’s role in the body by mapping the positions of its atoms and so deducing the molecule’s three-dimensional shape. But many proteins come coated with sugar chains that alter their structures, influencing the way they interact with one another, bind to receptors, and even act as medicines. Now, a new analysis offers the disturbing suggestion that most of the publicly deposited structures of these sugar appendages are wrong. The inaccuracies not only clog up databases of protein maps, but also misinform biologists and drug designers, who use the structural information to design novel molecular therapies.

The notion that sugar structures are difficult to pin down isn’t new, says Carolyn Bertozzi, a glycobiologist at Stanford University in Palo Alto, California, who was not involved in the study. Sugar molecules’ small size and flexibility can make it hard to identify their exact conformation. Still, the new work quantifies the scale of the problem and so “delivers an important message,” Bertozzi says. The incorrect structures, she adds, “reveal widespread ignorance of chemistry and glycobiology, or at least profound inattention to the [sugar] component” of protein structure.

Structural biologists determine the shapes of proteins and their attached sugars primarily with x-ray crystallography. The technique bombards crystals containing a vast number of identical proteins with short bursts of x-rays, and then tracks how those x-rays ricochet off. Different atoms cause x-rays to recoil in distinct ways. By charting the ricochet patterns, researchers can work out the most likely configuration of a protein. Still, x-ray data always have some blur to them, which makes it challenging to assign the exact position of all the atoms.

Structural biologists sharpen their images with computer modeling software that constrains where specific atoms can be positioned, based on well-understood factors such as the length of the bonds between atoms and the angles of those bonds. “This has been highly successful for the protein and, more recently, the nucleic acid components of crystal structures,” says Kevin Cowtan, a structural biologist at the University of York in the United Kingdom. However, “some of those restraints are missing or wrong for sugars, and thus many [sugar structures] have been built in unrealistic forms,” says Jon Agirre, Cowtan’s structural biology colleague at York.

Cowtan, Agirre, and their York colleagues report online this month in Nature Chemical Biology that this problem is widespread. The researchers examined the structures of the nearly 50,000 biologically relevant sugars in two major databases of protein structures, the Protein Data Bank (PDB) and PDB_REDO. Using a sugar-specialized structure modeling program developed by Agirre, they analyzed the raw x-ray data of each sugar deposited in the databases to determine its expected shape and checked how well it matched the structure reported. They scored each comparison between 0 and 1 to reflect how closely the reported structure matched the data, a measure they refer to as a density correlation. According to Cowtan, a score above 0.9 indicates a very good fit, and confidence in the match drops off rapidly below that.
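
For readers who want a feel for the scoring, here is a minimal sketch of the idea. It is not the York group's software. It assumes the density correlation behaves like an ordinary Pearson correlation between the measured density and the density implied by the model, sampled at the same points around a sugar. The function names, the sample numbers, and the "intermediate" label for scores between 0.8 and 0.9 are illustrative assumptions; only the 0.9 and 0.8 cutoffs come from the researchers.

```python
from math import sqrt


def density_correlation(observed, calculated):
    """Pearson correlation between two equal-length lists of density values.

    Assumption: 'observed' holds density sampled from the experimental map and
    'calculated' holds density computed from the model at the same points.
    """
    if len(observed) != len(calculated) or len(observed) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_c = sum(calculated) / n
    cov = sum((o - mean_o) * (c - mean_c) for o, c in zip(observed, calculated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_c = sum((c - mean_c) ** 2 for c in calculated)
    if var_o == 0 or var_c == 0:
        raise ValueError("density values must not all be identical")
    return cov / sqrt(var_o * var_c)


def classify_fit(score):
    """Label a score with the cutoffs quoted in the article.

    Above 0.9 is 'very good' and below 0.8 is 'poor'; the 'intermediate'
    label for everything in between is an assumption of this sketch.
    """
    if score > 0.9:
        return "very good"
    if score < 0.8:
        return "poor"
    return "intermediate"


# Hypothetical density samples, for illustration only.
measured = [1.0, 2.0, 3.0]
from_model = [2.0, 4.0, 6.0]
score = density_correlation(measured, from_model)
print(classify_fit(score))
```

In this toy case the model density is simply a scaled copy of the measured density, so the two track each other perfectly and the sugar would land in the top category.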

For their paper, the York researchers focused their analysis on one subgroup of sugars—called N-glycan-forming D-pyranosides—which are known to adopt a handful of specific and energetically favorable conformations, such as an upward- or downward-facing bowl. On the plus side, the York researchers found that 7.8% of the subgroup’s structures have good fits. The bad news: Sixty-four percent had a correlation to density of less than 0.8, reflecting what Agirre calls “a poor fit to the experimental data.” And there was worse news: “Twenty-five percent of the studied sugars are [reported to be] in energetically improbable conformations; these are most certainly wrong,” Agirre concludes.

He and his co-authors note that in the 1980s, the structural biology community faced a similar problem of prevalent mistakes in the positioning of atoms in amino acids and those amino acids in proteins. But the community solved it by coming together to improve its refinement software and develop a community-wide standard for computing structure positions. That hasn’t happened yet with sugars, Cowtan says. But the crucial role many sugars play in cellular communication and function is becoming clearer. So, he concludes, “the whole structural biology community should be worried about getting these correct.”