In early 2019, a bug in group FaceTime calls would have let attackers activate the microphone, and even the camera, of the iPhone they were calling and eavesdrop before the recipient did anything at all. The implications were so severe that Apple invoked a nuclear option, cutting off access to the group-calling feature entirely until the company could issue a fix. The vulnerability—and the fact that it required no taps or clicks at all on the part of the victim—captivated Natalie Silvanovich.
“The idea that you could find a bug where the impact is, you can cause a call to be answered without any interaction, that’s surprising,” says Silvanovich, a researcher on Google’s Project Zero bug-hunting team. “I went on a bit of a tear and tried to find these vulnerabilities in other applications. And I ended up finding quite a few.”
Silvanovich has spent years studying “interaction-less” vulnerabilities, hacks that don't require their targets to click a malicious link, download an attachment, enter a password in the wrong place, or participate in any way. Those attacks have taken on increasing significance as targeted mobile surveillance explodes around the world.
At the Black Hat security conference in Las Vegas on Thursday, Silvanovich is presenting her findings about remote eavesdropping bugs in ubiquitous communication apps like Signal, Google Duo, and Facebook Messenger, as well as popular international platforms JioChat and Viettel Mocha. All of the bugs have been patched, and Silvanovich says that the developers were extremely responsive about fixing the vulnerabilities within days or a few weeks of her disclosures. But the sheer number of discoveries in mainstream services underscores how common these flaws can be and the need for developers to take them seriously.
“When I heard about that group FaceTime bug I thought it was a unique bug that would never occur again, but that turned out not to be true,” says Silvanovich. “This is something we didn’t know about before, but it’s important now for the people who make communication apps to be aware. You're making a promise to your users that you’re not going to suddenly start transmitting audio or video of them at any time, and it’s your burden to make sure that your application lives up to that.”
The vulnerabilities Silvanovich found offered an assortment of eavesdropping options. The Facebook Messenger bug could have allowed an attacker to listen in on audio from a target's device. The Viettel Mocha and JioChat bugs both potentially gave advanced access to audio and video. The Signal flaw exposed audio only. And the Google Duo vulnerability gave video access for only a few seconds, though that was still long enough for an attacker to record a few frames or grab screenshots.
The apps Silvanovich looked at all build much of their audio and video calling infrastructure on real-time communication tools from the open source project WebRTC. Some of the interaction-less calling vulnerabilities stemmed from developers who seemingly misunderstood WebRTC features, or implemented them poorly. But Silvanovich says that other flaws came from design decisions specific to each service related to when and how it sets up calls.
When someone calls you on an internet-based communication app, the system can start setting up the connection between your devices right away, a process known as “establishment,” so the call can start instantly when you hit accept. Another option is for the app to hang back a bit, wait to see if you accept the call, and then take a couple of seconds to establish the communication channel once it knows your preference.
The latter is easier to implement in a privacy-preserving way, because there's less that can go wrong: the connection is established only after you affirmatively consent. Silvanovich says, for example, that she didn't find any interaction-less calling bugs in Telegram, because the app takes that slower and slightly clunkier approach. Most mainstream services take the other route, though, setting up the communication channel and even starting to send data like audio and video streams in advance to offer a near-instantaneous connection should the call's recipient pick up.
Doing that prep work doesn't inherently introduce vulnerabilities, and it can be done in a privacy-preserving way. But it does create more opportunities for mistakes. The key is to design a system that's vetted to work the way it's intended to.
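To make the distinction concrete, here is a minimal Python sketch of the recipient's side of a call. It is not taken from WebRTC or from any of the apps Silvanovich examined; the names `IncomingCall`, `pre_establish`, `handle_remote_message`, and `send_frame` are hypothetical and chosen only for illustration. The sketch assumes the eager approach, where the connection is prepared while the phone is still ringing, and shows one way to keep the promise that nothing is transmitted until the recipient accepts.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class CallState(Enum):
    RINGING = auto()
    ACCEPTED = auto()
    ENDED = auto()


@dataclass
class IncomingCall:
    """Hypothetical recipient-side model of a call (illustration only)."""
    state: CallState = CallState.RINGING
    connection_ready: bool = False
    muted: bool = False
    sent_frames: list = field(default_factory=list)

    def pre_establish(self) -> None:
        # Eager approach: set up the connection while the call is ringing.
        self.connection_ready = True

    def accept(self) -> None:
        # Only a local action by the recipient may move the call to ACCEPTED.
        if self.state is CallState.RINGING:
            # Deferred approach: the connection would be set up here instead.
            self.connection_ready = True
            self.state = CallState.ACCEPTED

    def handle_remote_message(self, message: str) -> None:
        # Messages from the caller can end the call but never answer it.
        if message == "hangup":
            self.state = CallState.ENDED
        # Every other remote message is ignored in this sketch.

    def may_transmit(self) -> bool:
        # The single gate that every outgoing audio or video frame passes.
        return (
            self.connection_ready
            and self.state is CallState.ACCEPTED
            and not self.muted
        )

    def send_frame(self, frame: bytes) -> bool:
        if not self.may_transmit():
            return False  # The frame is dropped and never leaves the device.
        self.sent_frames.append(frame)
        return True
```

In this sketch, preparing the connection early changes nothing about what is transmitted, because every frame goes through `may_transmit()` and only the recipient's own `accept()` can open that gate. The bugs described here would correspond, loosely, to some path that sends media around such a gate or lets a message from the caller flip the call into the accepted state.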
“With video conferencing, there are some big guarantees developers are making,” Silvanovich says. “That you’re not going to suddenly start transmitting video at any time, for example. Or does the Mute button really work? A reason a lot of these bugs happened is, people who designed these systems didn’t think about the promises they were making in terms of when audio and video are actually being transmitted and verify that they were being kept.”
Silvanovich adds that similar bugs likely remain undiscovered in mainstream communication apps. She looked only at one-to-one calling, for example, and the iOS group FaceTime vulnerability indicates that group calling may have its own slate of flaws. And she emphasizes that while brief audio or video snippets may not be a guaranteed gold mine for attackers in all cases, interaction-less attacks are often worth trying, because they appear innocuous and are difficult to trace.
“I find interaction-less bugs to be the most interesting class of vulnerabilities just because they’re so useful to attackers,” Silvanovich says. “If a user doesn’t have to do anything, that’s the easiest thing.”