Mirrors help you see objects outside of your line of sight, whether it’s a car passing you on the highway or an unfortunate rash on your face. And as it turns out, with some extra computer processing, almost any old shiny object can serve as a decent mirror. In new research, computer scientists at the University of Washington have exploited the light reflected from the metallic lining of a bag of snacks to create a relatively faithful reconstruction of its surroundings.
“Remarkably, images of the shiny bag of chips contain sufficient clues to be able to reconstruct a detailed image of the room, including the layout of lights, windows, and even objects outside that are visible through windows,” coauthors Jeong Joon Park, Aleksander Holynski, and Steve Seitz of the University of Washington wrote in a paper that has been accepted to this year’s Conference on Computer Vision and Pattern Recognition proceedings. Their research helps to resolve a technical hurdle for virtual and augmented reality technology, although some experts say the scope of its potential uses—and abuses—is much larger.
Technically speaking, the researchers didn’t actually use chips; they reconstructed a room using a Korean brand of chocolate-dipped corn puffs called Corn Cho. But whether it’s corn puffs or potato chips, the snack bag acts like a bad, warped mirror. A heavily distorted reflection of the room is contained in the glint of light that bounces off the bag, and the team developed an algorithm that unwarps that glint into a blurry but recognizable image.
In one instance, the researchers were able to resolve the silhouette of a man standing in front of a window. In another, the bag reflections allowed them to see through a window to the house across the street clearly enough to count how many stories it had. The algorithm works on a variety of glossy objects—the shinier, the better. Using the sheen of a porcelain cat, for example, they could also reconstruct the layout of the surrounding ceiling lights.
Generally, images of shiny objects tend to befuddle computers. For example, the glare often makes it difficult for computers to identify the object accurately. “What’s really interesting is that they didn’t see reflections as a corruption of the image,” says artificial intelligence researcher Deborah Raji of the AI Now Institute at New York University, who was not involved in the research. “They asked: ‘What can we see in the reflection?’”
To reconstruct the environment, the researchers used a handheld color video camera with a depth sensor that roughly detects the shape and distance of the shiny objects. They filmed these objects for about a minute, capturing their reflections from a variety of perspectives. Then, they used a machine learning algorithm to reconstruct the surroundings, which took on the order of two hours per object. Their reconstructions are remarkably accurate considering the relatively small amount of data that they used to train the algorithm, says computer scientist Abe Davis of Cornell University, who was not involved with the work.
The researchers could achieve this accuracy with so little training data in part because they incorporated some physics concepts into their reconstruction algorithm—the difference between how light bounces off shiny surfaces versus matte surfaces, for example. This differs from the typical online image recognition tools in use today, which simply look for patterns in images without any extra scientific information. However, researchers have also found that building too much physics into an algorithm can cause the machine to make more mistakes, as its processing strategies become too rigid. “They do a good job of balancing physical insights with modern machine learning tools,” says Davis.
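That shiny-versus-matte distinction is easy to see in the standard textbook shading models used in computer graphics. The sketch below (a generic illustration, not the team’s actual algorithm) contrasts Lambertian shading, where a matte surface’s brightness depends only on the angle to the light, with Phong-style specular shading, where a glossy highlight spikes only when the mirror reflection of the light lines up with the viewer. It is that directional sensitivity that lets a glint encode where the lights and windows in a room actually are.

```python
def lambertian(light_dir, normal, albedo=0.8):
    """Matte (diffuse) shading: brightness depends only on the angle
    between the surface normal and the light direction, so the
    reflected light carries almost no directional detail about the
    surrounding environment."""
    n_dot_l = max(0.0, sum(l * n for l, n in zip(light_dir, normal)))
    return albedo * n_dot_l


def specular(view_dir, light_dir, normal, shininess=200):
    """Shiny (specular) shading, in the classic Phong model: brightness
    spikes only when the mirror reflection of the light lines up with
    the viewer, so each highlight encodes the direction of a light
    source in the scene."""
    n_dot_l = sum(l * n for l, n in zip(light_dir, normal))
    # Mirror-reflect the light direction about the surface normal.
    reflect = [2 * n_dot_l * n - l for l, n in zip(normal, light_dir)]
    r_dot_v = max(0.0, sum(r * v for r, v in zip(reflect, view_dir)))
    return r_dot_v ** shininess
```

Tilting the light slightly barely changes the diffuse value but collapses the specular highlight to nearly zero; inverting that sharp dependence is, roughly speaking, how a reflection can be unwarped back into a picture of the room.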
The environment reconstruction, however, was merely one task in a larger project. The researchers’ ultimate goal was to generate new 3D perspectives of the chip bag: to have their computer accurately predict the bag’s appearance from all 360 degrees. Creating realistic views of a shiny object is a big challenge for AR and VR researchers. The glare patterns of a chip bag, for example, morph dramatically when you view it from different angles in a brightly lit room. Because it’s difficult to make a computer reproduce these changing patterns, virtual shiny objects often look distorted and flattened—not very realistic. But the University of Washington researchers found that, by first reconstructing a shiny object’s environment, they could render more realistic views of the object itself.
“I’m very interested in reconstructing the 3D world,” says lead author Park, a graduate student at the University of Washington. “By that, I mean copying the room you’re in and putting it in a virtual world, so that later you can interact with it in a realistic way.” He mentions future uses in VR gaming, for example. More realistic virtual perspectives could also benefit furniture companies like IKEA, which already offers an AR app called IKEA Place that allows you to virtually insert their products into the rooms of your house.
However, some experts caution that future versions of the technology are ripe for abuse. For example, it could enable stalkers or child abusers, says ethicist Jacob Metcalf of Data & Society, a nonprofit research center that focuses on the social implications of emerging technologies. A stalker could download images off of Instagram without the posters’ consent and, if those images contained shiny surfaces, deploy the algorithm to reconstruct the surroundings and infer private information about the people in them. “You better believe that there are a lot of people who will use a Python package to scrape photos off Instagram,” says Metcalf. “They could find a photo of a celebrity or of a kid that has a reflective surface and try to do something.”
Park points out that Instagram images don’t contain 3D depth information, which his algorithm needs in order to work. In addition, he says that his team considered potential misuse, particularly privacy violations such as surveillance, although they do not discuss these ethical considerations explicitly in the version of the paper currently available. Park says that image and video platforms like YouTube could, in the future, automatically detect reflective surfaces in videos and then blur or process the image to keep the reconstruction algorithm from working. “Future research could enable privacy-preserving cameras or software that limits what can be inferred about the environment from reflections,” Park wrote in an email to WIRED. He also says that the algorithm is not currently accurate enough to pose a threat.
Metcalf thinks Park and his co-authors should state these ethical considerations directly in the paper. In fact, he thinks that the data science community as a whole needs to consistently include ethics sections in their publications. “I want to be clear; this isn’t a criticism of these researchers specifically, but of the norms of data science,” says Metcalf. “The norms of data science as an academic discipline have not yet grappled with the fact that papers like this have potentially enormous impact on people’s wellbeing.”
These ethical discussions can influence the direction of future research in the field, says Raji. “Some researchers will be like, ‘It doesn’t mean anything if I state what my intent is with the research; people are going to do what they’re going to do,’” she says. “But what they don’t realize is that the ethical statements often shape the development of the field itself.”
In an email response to WIRED, Park wrote that the team will include an ethics section in the official version of the paper released in association with the conference, which is scheduled to take place in June.
Park’s team isn’t the first to realize that snack packaging can be used as sensors. In 2014, Davis and his colleagues demonstrated that you could use a bag of chips as a microphone. They played a MIDI file of “Mary Had A Little Lamb” at the chip bag, and by processing a high-speed video of the bag’s vibrations, they could play the song back.
“There’s a surprising amount of information in images of everyday objects that are just kind of sitting there,” says Davis. With the right algorithms, it seems, any faint rustle or glint of light can now tell a tale.