Government and academic scientists differ on how best to dose rat pups with BPA for toxicity testing.


In BPA safety war, a battle over evidence

In the 1930s, a U.K. biochemist made a curious observation that today remains at the center of a raging debate about chemical safety. He noticed that the synthetic chemical bisphenol A (BPA) weakly mimics the human hormone estrogen. In the decades that followed, BPA became a ubiquitous ingredient in epoxy resins and polycarbonate plastic, used by the millions of tons every year for everything from dental sealants to plastic water bottles. But BPA doesn't stay put. In the 1990s, Stanford University researchers realized that tiny amounts can leach out of plastic. Researchers and the public, already worried that other hormone-mimicking chemicals could be interfering with the endocrine system—the symphony of hormones in the human body—wondered whether those traces of BPA were doing any harm.

By now tests have found the chemical in more than 90% of Americans. But the risks of BPA contamination are still in dispute. One reason: Studies have produced conflicting or inconclusive results, in part because alterations in the endocrine system can be subtle and hard to pin down. Another is a deep rift between academic scientists and regulators about which kinds of studies are best for shaping government oversight of chemicals.

In 2014, the U.S. Food and Drug Administration (FDA) vetted 161 new studies about the potential health effects of BPA. The goal was to see whether science could make a definitive judgment about the compound's safety and light a clear path for regulators. The compilation of evidence included reams of papers published in peer-reviewed journals, many of which found evidence suggesting tiny amounts of BPA could tinker with the human body. Yet agency scientists judged most of the studies to be useless for setting policy. The number they found to be compelling enough to help determine a safe dose of BPA: four. None reported an effect from small doses.

The divide highlighted by those results extends far beyond the BPA fight to disputes over the regulation of chemicals ranging from flame retardants to pesticides. On one side is research performed at many universities. On the other are so-called guideline studies, done using rules called "good laboratory practice" (GLP), which emerged in the 1970s and '80s and shape most chemical research in industry and government labs. The approach, which includes standards for data collection, record keeping, and acceptable kinds of tests, was born of a push to improve chemical safety studies. Two of the four studies that passed muster in the BPA review, for example, were guideline studies. But critics say the rules and standards have become barriers to using the most advanced science and weighing all the evidence available.

"The scientific apparatus in the world is higher than it has ever been in human history, and yet most of that is being ignored when it comes to public health protection from environmental chemicals," says Tom Zoeller, a research endocrinologist at the University of Massachusetts (UMass) in Amherst, who has studied how BPA interacts with the thyroid system.

Today, some scientists and regulators are wrestling with how to bridge this divide. One of the most ambitious efforts is a $30 million initiative headed by the National Institute of Environmental Health Sciences (NIEHS) in Durham, North Carolina. It's getting academic and government scientists to merge their methods, in hopes of shedding more light on the potential risks of BPA and providing a new model for assessing chemical safety. But the bumps jostling this project illustrate just how hard that might be.

Rooted in scandal

The roots of the present impasse can be traced, in part, to 1976 and a rodent-filled laboratory in a Chicago, Illinois, suburb. That's when federal officials started investigating Industrial Bio-Test Laboratories, after an FDA pathologist became suspicious when test reports appeared too good to be true. At the time, Bio-Test was one of the nation's largest private chemical testing labs, used by industries trying to satisfy federal safety standards.

The ensuing scandal triggered congressional hearings and criminal convictions for three of the company's top scientists, amid tales of gruesome lab conditions and falsified data. It also spurred the U.S. government to adopt GLP. The rules spell out extensive supervision, inspection, and record-keeping requirements aimed at ensuring lab procedures are followed and data aren't fudged.

For private and government laboratories working in the regulatory arena, GLP has become de rigueur. But few academic researchers use it, relying instead on the scrutiny their work gets from granting agencies, peer-reviewed journals, and university committees. "The paperwork burden is extremely cumbersome," says Jodi Flaws, a reproductive toxicologist at the University of Illinois in Urbana, who is part of the massive BPA study. "I feel like most academic researchers are honest and keep good records to be able to repeat their experiments. But I think GLP reporting is just a whole different level."

As chemical regulations matured in the 1980s, these lab rules were paired with new standards spelling out what kinds of tests regulators wanted to see to gauge a particular health effect, such as whether a chemical causes cancer. Today, there are hundreds of these guideline tests, most of them maintained by the Organisation for Economic Co-operation and Development (OECD), a Paris-based alliance of 35 countries created in the wake of World War II.

Initially, guideline studies were meant for screening, to make sure chemicals passed a minimum safety threshold before they were allowed into the market, says Chris Portier, former head of the National Toxicology Program, part of NIEHS. But over time, he says, regulators have come to treat guideline studies as a more definitive statement that a chemical is safe, and more reliable than less-scripted academic studies. "[The regulators] feel that the academic studies are not done as well. They're more likely to have bias or mistakes or misinterpretation. So they tend to shy away," Portier says.

The result is that when regulators evaluate a chemical, academic research can wind up getting second billing to industry and government studies, says Tracey Woodruff, a former senior scientist at the Environmental Protection Agency (EPA) who is now at the University of California, San Francisco (UCSF). For example, European chemical regulators rank the value of studies using a "Klimisch score." Its criteria include whether a study was done using GLP. Non-GLP studies are unlikely to earn the highest score.

Industries also pressure regulators to disregard research outside the boundaries of GLP guideline studies. In November 2016, for example, the Washington, D.C.-based pesticide industry group CropLife America petitioned EPA not to restrict a class of pesticides based on human epidemiological studies, in particular work by Columbia University scientists. That research found neurological problems in children with higher prenatal exposure to the pesticide chlorpyrifos. In the petition, CropLife warned against using studies that "do not meet well-defined data quality standards" and said that EPA hasn't spelled out criteria for using epidemiological research, as it has for traditional lab studies.

Bob Diderich, head of the OECD division that drafts study guidelines, says regulators should weigh evidence even if it doesn't come from guideline studies. But he says it can be hard to figure out what to do with academic research using cutting-edge techniques such as toxicogenomics, which teases out changing patterns in gene activity due to toxic chemicals. "You have difficulties in judging: What does it mean in the real world? How does it translate to effects in humans and in animals?" Diderich says.

Baby bottles are now free of BPA, but it is still present in other food containers and packaging.

Screening the science

FDA's handling of the science around tiny doses of BPA illustrates the gulf between these two kinds of studies.

In 2008, the agency declared that the amount of BPA people typically ingest from food poses no health risks. It relied chiefly on two GLP guideline studies that found no evidence of harm from low doses. Both were done by a private lab with funding from the plastics industry. Since then, FDA has reviewed emerging research several times. It has now acknowledged there is some concern about BPA's effect on the brain and behavior, and on the prostate gland, but it hasn't changed its overall assessment that the dose people typically get from food doesn't pose a risk. And it continues to label many academic studies as flawed or not usable for setting regulatory standards.

Jason Aungst, an FDA toxicologist in College Park, Maryland, who helped lead those reviews, says that although academic studies can show "great flexibility," they are often more focused on deciphering the mechanism of how the chemical influences an organism, not on gauging how toxic a chemical is. "Usually when we see guideline studies, those are studies that have been validated, tested multiple times between multiple laboratories, and have produced reproducible results that we can have confidence in using in setting a safety level," Aungst adds.

But some academic researchers say the guideline studies rely on outdated testing methods. They point to a growing body of research finding effects from small doses of BPA on everything from anxiety to diabetes. Heather Patisaul, a neurotoxicologist at North Carolina State University in Raleigh, for example, has been scrutinizing the nervous systems of rats exposed to tiny amounts of BPA before they were born. By looking at messenger RNA levels in the brain, she found that BPA-exposed rats showed signs of more abundant estrogen receptors in the hypothalamus and amygdala, structures that can influence reproduction and behavior.

In contrast, a 2010 guideline study looked for neurological problems by weighing rodent brains, examining them under a microscope, and using tests such as laying an animal on its back and seeing how long it took to get on its feet. It reported no evidence of low-dose effects. "These guideline studies were designed when we were looking for overt, organ-level toxic effects. Things like cancer, big tumors," Patisaul says. "The guideline studies are missing a lot of things that I think would reflect effects on endocrine function."

Bridging the divide

Linda Birnbaum, an endocrinologist who is head of NIEHS, wondered whether it was possible to cut through this confusion by getting academic and FDA scientists to collaborate.

"The 21st century studies and approaches are revealing effects about human biology and the effects of chemicals or drugs that are not evident when you do some of the guideline studies," Birnbaum recalls. So "we asked 12 different questions which were not traditionally asked in guideline studies." The $30 million project, known as the Consortium Linking Academic and Regulatory Insights on BPA Toxicity, or CLARITY-BPA, started in 2012. It spans 12 university labs and FDA's National Center for Toxicological Research in Jefferson, Arkansas. There, lab workers raised more than 3800 rats following GLP protocols and exposed part of the population to BPA.

Tissue samples from these GLP rats were coded so that the labs analyzing them could be blinded to BPA exposure and dose. The university labs then looked for a range of possible effects, including subtle chemical, anatomical, and genetic changes in reproductive organs and mammary glands, the immune and nervous systems, the thyroid, and the heart. The data then went back to FDA for decoding.

The findings are now trickling out. Several university scientists say they are seeing effects at low levels of BPA, similar to their own earlier studies. Patisaul's lab again found evidence that small doses influence estrogen receptors in the brain.

For Zoeller, the UMass scientist, CLARITY held the promise of finding a way to pass muster with regulators while still using advanced scientific techniques. "I think that the design of CLARITY is incredibly important," he says. But he and others say the reality has been sobering. They describe a culture clash that has simmered throughout the program. "They have this attitude that they're the FDA and what they do is correct and that's the end of the story," says Gail Prins, an endocrinologist at the University of Illinois in Chicago who studies BPA's impact on the prostate. "That rubbed a lot of people a little bit sore."

Early on, for example, FDA researchers insisted on feeding the rats BPA by sticking a tiny tube down their throats and into their stomachs every day, a technique known as gavage. That's a standard method in guideline studies, because it allows for exact control of doses. Some university scientists lobbied for the gentler measures used in their labs, such as training the animals to sip BPA-infused oil from a tiny straw. They argued that gavage could stress the animals, potentially skewing test results. Gavage won out.

Despite CLARITY's hiccups, Birnbaum says, "we think it could be a model" for future studies. "Say for a high-volume compound of major industrial significance, it might be worth doing this kind of approach again." And Birnbaum hopes it may ultimately deliver a clearer verdict on BPA's risks, through the academic work and a new, large-scale guideline study of the same rats, which FDA plans to release in 2018. Because the study is part of CLARITY, university scientists were involved in the study design from the beginning.

An emerging shift

Outside FDA, some are now looking to the world of clinical medicine for a model of how to better incorporate cutting-edge research like Patisaul's into regulatory science. Known as systematic review, the approach pairs a panel of experts with a step-by-step guide for assessing scientific evidence on a subject. Conceived in the 1990s to study medical questions like the best way to treat heart attacks, systematic review rates each study based on quality, the strength of the evidence, and potential sources of bias. UCSF's Woodruff is leading a project to adapt the system for environmental chemicals. The work, which was partly funded by EPA, would allow regulators to set aside GLP and guideline standards and, she says, consider each study on its merits.

Some government agencies are beginning to adopt the strategy. Under Birnbaum, the National Toxicology Program's Office of Health Assessment and Translation recently developed a systematic review approach. The agency used it in 2016 to conclude that the nonstick coating ingredient perfluorooctanoic acid and the stain repellent perfluorooctane sulfonate were harmful to people's immune systems.

Environmental and chemical industry groups are both lobbying EPA to use systematic reviews as it examines chemicals under the recently amended chemical safety law, the Toxic Substances Control Act. Changes made to the act in 2016 require the agency to screen at least 20 high-risk chemicals in the next 3.5 years and scrutinize any new chemicals before letting them on the market.

EPA hasn't settled on its approach yet. In a proposed rule issued shortly before the departure of the Obama administration, the agency signaled that it wasn't planning a major shift from previous methods of weighing the evidence on safety. But with interest groups speaking up and the arrival of a new administration averse to environmental regulations, the struggle over how to implement the law—and how to scrutinize the science—will be intense.

"This is a big topic area," Woodruff says. "Millions and probably billions of dollars are at stake, and people's lives, so it really matters how the evidence is evaluated."