Imagine searching through your digital photos by mentally picturing the person or image you want. Or sketching a new kitchen design without lifting a pen. Or texting a loved one a sunset photo that was never captured on camera. A computer that can read your mind would find many uses in daily life, to say nothing of its value to people who are paralyzed and have no other way to communicate. Now, scientists have created the first algorithm of its kind to interpret, and accurately reproduce, images seen or imagined by another person. It might be decades before the technology is ready for practical use, but researchers are one step closer to building systems that could help us project our inner mind’s eye outward.
“I was impressed that it works so well,” says Zhongming Liu, a computer scientist at Purdue University in West Lafayette, Indiana, who helped develop an algorithm that can somewhat reproduce what moviegoers see when they’re watching a film. “This is really cool.”
Using algorithms to decode mental images isn’t new. Since 2011, researchers have recreated movie clips, photos, and even dream imagery by matching a person’s brain activity to activity recorded earlier while that person viewed images. But these methods all have their limits: Some deal only with narrow domains like face shape, and others can’t build an image from scratch; instead, they must select from preprogrammed images or categories like “person” or “bird.” This new work can generate recognizable images on the fly and even reproduce shapes that are not seen, but imagined.
To figure out what a person is seeing, researchers turned to functional magnetic resonance imaging (fMRI), which measures blood flow in the brain as a proxy for neural activity. They mapped out visual processing areas to a resolution of 2 millimeters as three people looked at more than 1000 images several times each. The goal was to consider just the activity in response to an image, such as a leopard, and eventually have a computer paint an image that would produce nearly the same activity.
But instead of showing their subjects painting after painting until the computer got it right, the team built a software stand-in for the brain, a deep neural network (DNN) with several layers of simple processing elements. “We believe that a deep neural network is a good proxy for the brain’s hierarchical processing,” says Yukiyasu Kamitani, a neuroscientist at Kyoto University in Japan, and the study’s senior author. “By using a DNN we can extract information from different levels of the brain’s visual system,” from simple light contrast up to more meaningful content such as faces.
Using a “decoder,” the researchers translated the brain’s fMRI responses to the images into the corresponding patterns of activity in the DNN. From then on, they no longer needed the actual fMRI measurements, just the DNN translations.
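One way to picture the decoder step is as a regression from voxel responses to DNN feature values. The sketch below is only an illustration on synthetic data: it assumes a simple ridge regression, which the article does not specify, and the function names, array sizes, and `alpha` setting are all hypothetical.

```python
import numpy as np

def fit_ridge_decoder(voxels, features, alpha=1.0):
    """Fit a linear map W so that voxels @ W approximates features.

    voxels:   (n_trials, n_voxels) fMRI responses (synthetic here)
    features: (n_trials, n_features) DNN activity for the same images
    alpha:    ridge penalty (an assumed, illustrative value)
    """
    n_voxels = voxels.shape[1]
    gram = voxels.T @ voxels + alpha * np.eye(n_voxels)
    return np.linalg.solve(gram, voxels.T @ features)

def decode_features(voxels, weights):
    """Translate new voxel responses into predicted DNN activity."""
    return voxels @ weights

# Synthetic stand-in data: a hidden linear relationship plus noise.
rng = np.random.default_rng(0)
n_trials, n_voxels, n_features = 200, 50, 10
true_map = rng.normal(size=(n_voxels, n_features))
train_voxels = rng.normal(size=(n_trials, n_voxels))
noise = 0.1 * rng.normal(size=(n_trials, n_features))
train_features = train_voxels @ true_map + noise

weights = fit_ridge_decoder(train_voxels, train_features, alpha=1.0)

# Decode unseen "scans" and compare with the noiseless target activity.
test_voxels = rng.normal(size=(20, n_voxels))
decoded = decode_features(test_voxels, weights)
target = test_voxels @ true_map
agreement = np.corrcoef(decoded.ravel(), target.ravel())[0, 1]
print(f"correlation between decoded and target features: {agreement:.3f}")
```

Once such a map is fitted, only the decoded feature values are carried forward, which mirrors the point that the raw fMRI measurements are no longer needed.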
In guessing what someone was viewing, the translation acts as a template, and the fMRI data are set aside. The system then tries to paint a picture that will trigger the DNN to respond in a way that matches this template. It does so through trial and error until, hopefully, it paints the desired image, whether a leopard, a duck, or a stained-glass window. The system starts with something random, similar to TV static, and slowly refines its painting over the course of 200 rounds. To get closer to the ideal image, the system calculates the difference between the DNN’s current activity and the template activity. Those calculations cause it to nudge one pixel this way and one pixel that way, until it gets closer to its ideal image.
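The refinement loop described here can be sketched as gradient descent on the pixels themselves. The example below is a toy under stated assumptions, not the study’s network: a single random layer with a tanh nonlinearity stands in for the DNN, and the image size, step size, and function names are hypothetical. Only the starting point (random static) and the 200 rounds come from the description above.

```python
import numpy as np

def dnn_features(image, weights):
    """Toy stand-in for a DNN: one random linear layer plus tanh."""
    return np.tanh(weights @ image)

def reconstruct(template, weights, n_rounds=200, step=0.05, seed=0):
    """Nudge pixels so the stand-in network's activity matches the template."""
    rng = np.random.default_rng(seed)
    # Start from low-amplitude random noise, like TV static.
    image = rng.normal(scale=0.1, size=weights.shape[1])
    losses = []
    for _ in range(n_rounds):
        activity = dnn_features(image, weights)
        diff = activity - template
        losses.append(float(np.sum(diff ** 2)))
        # Gradient of the squared difference with respect to each pixel.
        grad = weights.T @ (2.0 * diff * (1.0 - activity ** 2))
        image -= step * grad
    return image, losses

rng = np.random.default_rng(1)
n_pixels, n_features = 64, 32
weights = rng.normal(size=(n_features, n_pixels)) / np.sqrt(n_pixels)

# The template plays the role of the decoded DNN activity for a hidden image.
hidden_image = rng.uniform(-1.0, 1.0, size=n_pixels)
template = dnn_features(hidden_image, weights)

image, losses = reconstruct(template, weights)
print(f"mismatch at start: {losses[0]:.3f}, after 200 rounds: {losses[-1]:.5f}")
```

In this toy the mismatch shrinks steadily, but matching the features does not guarantee recovering the hidden image exactly, because many different images can produce similar activity.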
To make the final product more accurate, researchers included a “deep generator network” (DGN), an algorithm that in this case has been pretrained to generate realistic images based on its input. The DGN refines the paintings to look more naturalistic. Once that was added, a neutral human observer could tell which of two photos an image was meant to recreate 99% of the time, the researchers reported in a paper uploaded to the preprint server bioRxiv late last month.
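The “which of two photos” test can be illustrated with an automated analogue. In the sketch below, pixel-wise correlation takes the place of the human observer; that substitution is an assumption made for illustration, as are the synthetic images, the noise level, and the function names.

```python
import numpy as np

def pearson(a, b):
    """Pearson correlation between two images, compared pixel by pixel."""
    a = a.ravel() - a.mean()
    b = b.ravel() - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def pairwise_accuracy(reconstructions, originals):
    """Fraction of two-way choices where a reconstruction is closer
    to its own original than to a distractor image."""
    n = len(originals)
    correct = 0
    total = 0
    for i in range(n):
        own = pearson(reconstructions[i], originals[i])
        for j in range(n):
            if i == j:
                continue
            total += 1
            if own > pearson(reconstructions[i], originals[j]):
                correct += 1
    return correct / total

# Synthetic stand-ins: "reconstructions" are noisy copies of the originals.
rng = np.random.default_rng(0)
originals = rng.uniform(size=(10, 16, 16))
reconstructions = originals + 0.5 * rng.normal(size=originals.shape)

accuracy = pairwise_accuracy(reconstructions, originals)
print(f"two-way identification accuracy: {accuracy:.2f}")
```

Chance performance on this kind of test is 50%, which is the baseline against which a figure such as 99% should be read.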
Next, the scientists tried to read the minds of people merely imagining images. This time they scanned the three subjects’ brains after asking them to recall images previously displayed, including a fish, an airplane, and simple colored shapes. The method didn’t work well for photos, but for the shapes, the generator created a recognizable image 83% of the time.
It’s “interesting and careful work,” says Nikolaus Kriegeskorte, a computational neuroscientist at Columbia University’s Zuckerman Institute. He wonders to what extent the inaccuracies in the computer-generated images are due to limitations in brain activity measurements and to what extent they reflect mistakes in how our brains interpret images. “Higher-resolution fMRI and other brain imaging techniques might further improve the results,” he says. With better measurements and continued improvement in algorithms, we might someday communicate through mental pictures.