GlaxoSmithKline (GSK) is creating a data-focused culture and a global machine-learning team.
GlaxoSmithKline’s (GSK’s) data-first approach to drug discovery and development comes directly from chief executive officer (CEO) Emma Walmsley and chief scientific officer (CSO) Hal Barron. Their goal is to double the chance of producing successful medicines by using genetically validated targets, and that demands a strong team in artificial intelligence and machine learning (AI/ML).
Building and leading that team is Kim Branson, senior vice president and global head of the AI/ML group. Branson initially thought that GSK would be “quite ossified,” but as he discovered more, he found “a revolution.” At the highest level, he says, “leadership understands what ML is, how it’s done, and how to invest in it.”
GSK is changing more than its R&D strategy; it’s changing its culture.
In the long term, R&D at GSK starts with the idea that the combination of human genetics, functional genomics, and AI/ML can transform drug discovery and generate more genetically validated targets.
For human genetics, GSK signed a 2018 deal with 23andMe. With more than 10 million customers, the direct-to-consumer genetics company has the largest database of genetic and phenotypic information that could lead to new drug targets. GSK and 23andMe will analyze the aggregate, de-identified data on the 80% of customers who consented to contribute to research. Other GSK collaborations include the UK Biobank, which is now collecting whole-genome sequences on 500,000 participants.
For functional genomics, GSK and the University of California at Berkeley and at San Francisco launched the Laboratory for Genomics Research to use CRISPR methods to study human disease genes. Researchers will probe cell responses to altered gene expression and investigate gene interactions.
From those two sources, Branson says, “we’ll generate more data than GSK has created in its entire history.” That means AI/ML is crucial for using the data to help find candidate targets and medicines that meaningfully impact disease.
The AI/ML team—about 45 people and growing—is based in Heidelberg, London, San Francisco, Boston, and Philadelphia. All team members have a strong AI/ML background and expertise in a specific type of data, such as medical imaging or sequencing.
“We operate as a research group,” Branson says, adding that the team will post resources on arXiv and publish results in peer-reviewed journals.
About 5 years ago, GSK began cleaning and organizing existing data into a custom platform. Structured, quantitative data include chemical properties of drug candidates and results from clinical trials, transcriptomics, and proteomics. Unstructured data include notes and reports, for which the AI/ML team is building natural language models. Data from medical imaging, such as MRIs, hold great potential but also pose challenges. The AI/ML group now works on integrating these multimodal data from external and internal sources into modular neural network models, Branson says.
The main work, Branson says, is “integration of functional genomics, proteomics, and transcriptomics data from large industrial-scale experiments to build new models.” Other AI/ML work might include investigating drug properties, such as toxicity and solubility, to suggest compound design and using ML with clinical data, such as pathology images, to predict responses to drugs. Manufacturing is another possible ML use. “It costs a lot to make complicated biological drugs,” Branson says. “If we increase yields, we can lower costs.”
Branson advocates for an overall culture change in the R&D team of 10,000 people. The goal is to get people to plan their work so that it collects as much data as possible, including recorded experimental details. Branson envisions a virtuous cycle: As people use ML, they’ll be motivated to collect more and better data that advance ML.
For the growing AI/ML team, GSK needs people with computer science, mathematics, and AI/ML experience who are also experts in a field of data and have strong communication skills. Those people are rare, so Branson says education and training are also core features of the work. “If you’re interested in the intersection where engineering meets biology, deep learning, and functional genomics data,” he says, “this is a unique place to be.”
GSK is also creating a new fellowship program to continue that training. It gives outstanding early-career researchers in machine learning an opportunity to advance their careers by applying machine learning to medicine, vaccine, and drug discovery. Fellows will explore how machine learning can crack some of the hardest problems in medicine, with access to GSK’s world-class data and compute infrastructure as well as huge proprietary datasets. GSK’s AI effort puts a particular focus on graph neural nets, causal-based methods, and reinforcement learning. Fellows have an unusual opportunity to develop novel methodology, publish at major conferences, and receive mentoring from the group’s experienced senior team members. The first intake of GSK.AI Fellows will join the team in GSK’s Pancras Square office in spring 2020. More information can be found at gsk.ai.
GSK ranks among the top employers in Science Careers’ 2019 Top Employer survey.