When the soccer World Cup kicks off this Thursday, imagine watching it as a 3D “hologram” on your kitchen table. That may not be far off thanks to a new technique that turns YouTube videos into 3D reconstructions of matches.
The key to the approach is a convolutional neural network, a type of artificial intelligence algorithm loosely modeled on the part of the brain that processes visual data, which researchers trained to estimate how far the surfaces of each player are from the camera that recorded the match. The network analyzed 12,000 2D images of players extracted from the soccer video game FIFA, alongside the corresponding 3D data from the game engine, to learn how the two correlate. That allowed it to estimate depth maps for players in 2D images it had never seen. When shown new videos, the system accurately predicted a depth map for each player and combined it with the color footage to reconstruct that player in 3D. The players were then superimposed on a virtual soccer pitch, allowing the match to be viewed from a variety of angles in any 3D content viewer.
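To make the depth-plus-color step concrete, here is a minimal sketch of how a depth map can be turned into colored 3D points. It is not the researchers' code. It assumes a simple pinhole camera model, and the function name `backproject` and the camera parameters `fx`, `fy`, `cx`, and `cy` are hypothetical names chosen for illustration.

```python
from typing import List, Optional, Tuple

Color = Tuple[int, int, int]
Point = Tuple[float, float, float, Color]


def backproject(
    depth: List[List[Optional[float]]],
    colors: List[List[Color]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point]:
    """Lift each pixel that has a depth value into a colored 3D point.

    Assumes a pinhole camera: fx and fy are focal lengths in pixels,
    (cx, cy) is the principal point. A depth of None marks a pixel
    that does not belong to a player and is skipped.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):        # v is the pixel row
        for u, z in enumerate(row):        # u is the pixel column
            if z is None:
                continue
            x = (u - cx) * z / fx          # horizontal offset from the optical axis
            y = (v - cy) * z / fy          # vertical offset from the optical axis
            points.append((x, y, z, colors[v][u]))
    return points


# Toy 2x2 "image": only two pixels belong to a player.
depth = [[None, 2.0], [4.0, None]]
colors = [[(0, 0, 0), (255, 0, 0)], [(0, 0, 255), (0, 0, 0)]]
points = backproject(depth, colors, fx=2.0, fy=2.0, cx=0.0, cy=0.0)
```

Each resulting point carries its position and the color of the pixel it came from, which is roughly what a 3D viewer needs to draw a player from a new angle. Because the video only shows the side of each player that faces the camera, a reconstruction built this way has no information about the far side.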
The researchers tested the approach with Microsoft’s HoloLens smart glasses, which let them overlay the 3D reconstruction onto a real-world tabletop. The end product is still glitchy: It can’t recreate the ball, doesn’t work in real time, and only permits viewpoints from the side of the pitch where the video was recorded. But the technique could be more scalable than leading approaches for reconstructing sports in 3D, which require arrays of cameras around the pitch to record every angle. The researchers say the approach should also work for other events that happen in predefined arenas, such as concerts or theater performances.