An Absurdly Basic Bug Let Anyone Grab All of Parler's Data

The social media platform Parler rose to prominence as an outlet for free speech. In practice, it became a haven for disinformation, hate speech, and calls for violence, the sort of content generally blocked on more mainstream platforms like Twitter and Facebook. It's fair to say, though, that by “free speech” the site's creators didn't mean that anyone could freely download every message, photo, and video posted to the site, including sensitive geolocation data. But a very basic bug in Parler's architecture nonetheless seems to have made it all too easy to do just that.

Late Sunday night, Parler went offline after Amazon Web Services cut off hosting for the social media outlet, a decision that followed the site's use as a tool to plan and coordinate an insurrectionist, pro-Trump mob's invasion of the US Capitol building last week. In the days and hours before that shutdown, a group of hackers scrambled to download and archive the site, uploading dozens of terabytes of Parler data to the Internet Archive. One pseudonymous hacker who led the effort and goes only by the Twitter handle @donk_enby told Gizmodo that the group had successfully archived "99 percent" of the site's public contents, which she said includes a trove of "very incriminating" evidence of who participated in the Capitol raid and how.

By Monday, rumors were circulating on Reddit and across social media that the mass disemboweling of Parler's data had been carried out by exploiting a security vulnerability in the site's two-factor authentication that allowed hackers to create "millions of accounts" with administrator privileges. The truth was far simpler: Parler lacked the most basic security measures that would have prevented the automated scraping of the site's data. It even ordered its posts by number in the site's URLs, so that anyone could have easily, programmatically downloaded the site's millions of posts.

Parler's cardinal security sin is known as an insecure direct object reference, says Kenneth White, codirector of the Open Crypto Audit Project, who looked at the code of the download tool @donk_enby posted online. An IDOR occurs when a hacker can simply guess the pattern an application uses to refer to its stored data. In this case, the posts on Parler were simply listed in chronological order: Increase a value in a Parler post URL by one, and you'd get the next post that appeared on the site. Parler also doesn't require authentication to view public posts and doesn't use any sort of "rate limiting" that would cut off anyone accessing too many posts too quickly. Together with the IDOR issue, that meant that any hacker could write a simple script to reach out to Parler's web server and enumerate and download every message, photo, and video in the order they were posted.
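To make the combination concrete, here is a minimal sketch, run entirely against an in-memory stand-in rather than any real server, of why sequential identifiers are trivially enumerable and how even a crude rate limit changes the picture. Everything in it is an assumption made for illustration: the `POSTS` dictionary, the `fetch_post` function, the `SlidingWindowLimiter` class, and the 60-requests-per-minute budget are hypothetical and are not drawn from Parler's code or from the archiving tool.

```python
import time
from collections import deque

# Hypothetical stand-in for a post store keyed by sequential integers.
POSTS = {i: f"post body {i}" for i in range(1, 501)}


def fetch_post(post_id):
    """Return a public post with no authentication check, or None if absent."""
    return POSTS.get(post_id)


def enumerate_posts(fetch, start=1):
    """Walk IDs upward from `start` until a request comes back empty."""
    collected = []
    post_id = start
    while True:
        body = fetch(post_id)
        if body is None:
            break
        collected.append(body)
        post_id += 1
    return collected


class SlidingWindowLimiter:
    """Allow at most `limit` calls per `window` seconds for each client key."""

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.calls = {}

    def allow(self, client):
        now = self.clock()
        recent = self.calls.setdefault(client, deque())
        # Forget calls that have aged out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.limit:
            return False
        recent.append(now)
        return True


def limited_fetch(limiter, client):
    """Wrap fetch_post so a client over its budget gets nothing back."""
    def fetch(post_id):
        if not limiter.allow(client):
            return None  # a real server would answer with an error status
        return fetch_post(post_id)
    return fetch


unthrottled = enumerate_posts(fetch_post)
throttled = enumerate_posts(
    limited_fetch(SlidingWindowLimiter(limit=60, window=60.0), "client-a")
)
print(len(unthrottled), len(throttled))
```

In this toy setup the unthrottled loop collects all 500 posts, while the throttled one stops after its 60-request budget. A rate limit alone only slows a patient scraper down; it does not make the identifiers any harder to guess.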

"It's just a straight sequence, which is mind-numbing to me," says White. "This is like a Computer Science 101 bad homework assignment, the kind of stuff that you would do when you're first learning how web servers work. I wouldn't even call it a rookie mistake because, as a professional, you would never write something like this."

Services like Twitter, by contrast, randomize the URLs of posts so they can't be guessed. And while they offer APIs that give developers access to tweets en masse, they carefully restrict access to those APIs. Parler, on the other hand, had no authentication for an API that offered access to all its public contents, says Josh Rickard, a security engineer for security firm Swimlane. "Honestly it seemed like an oversight, or just laziness," says Rickard, who says he analyzed Parler's security architecture in a personal capacity. "They didn’t think about how big they were going to get, so they didn’t do this properly."
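One common way to make post URLs unguessable is to assign each post a long random identifier instead of a counter. The sketch below shows that general idea using Python's standard `secrets` module; it is not a description of how Twitter or any other service actually generates its identifiers, and the `PostStore` class and the `/post/` URL shape are hypothetical names chosen for illustration.

```python
import secrets


class PostStore:
    """Hypothetical store that hands out an unguessable identifier per post."""

    def __init__(self, id_bytes=16):
        self.id_bytes = id_bytes
        self._posts = {}

    def create(self, body):
        # token_urlsafe draws from the operating system's secure random
        # source; 16 bytes gives 128 bits of randomness per identifier.
        post_id = secrets.token_urlsafe(self.id_bytes)
        while post_id in self._posts:  # a collision is astronomically unlikely
            post_id = secrets.token_urlsafe(self.id_bytes)
        self._posts[post_id] = body
        return post_id

    def get(self, post_id):
        return self._posts.get(post_id)


store = PostStore()
first = store.create("first post")
second = store.create("second post")
print(f"/post/{first}")
print(f"/post/{second}")
# Knowing one identifier reveals nothing about its neighbors, so
# "add one to the URL" no longer lands on the next post.
print(store.get("1"))
```

Random identifiers remove the enumeration shortcut, but they work alongside authentication and rate limiting rather than replacing them: anyone who is given a link can still follow it.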

WIRED reached out to Parler for comment, but the company so far hasn't responded.

Despite Parler's security woes, @donk_enby was careful to counter rumors that hackers had accessed all Parler information, including the images of driver's licenses that Parler asks users to submit if they want a verified account. "Only things that were available publicly via the web were archived," @donk_enby wrote in a Twitter post. A Reddit rumor that hackers gained access to more private data on the site—due to SMS provider Twilio cutting ties with Parler and disabling its two-factor authentication—was "bullshit," @donk_enby confirmed in a message to WIRED. While Twilio did drop Parler as a customer, the result was only that hackers could bypass two-factor authentication if they knew an account's password or could mass-generate new accounts, she says. They could not gain access to existing accounts.

Even so, White points out that Parler appears to have failed to scrub geolocation metadata from images and videos before they were posted. So while the data that hackers have pulled from the site may be public, the result is that much of that archived content also contains Parler users' detailed locations, likely revealing the GPS coordinates of many of their homes. Data artist Kyle McDonald has already created a visualization of the locations of 68,000 of the archived Parler videos.
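In JPEG photos, location data typically travels in an EXIF block stored in a metadata segment known as APP1, which an upload pipeline can drop before publishing. The following is a minimal sketch of that idea using only Python's standard library, exercised on a synthetic byte stream rather than a real photo. The function name `strip_app1` and the sample data are hypothetical; a production service would more likely rely on a maintained imaging library, and videos and other metadata segments would need separate handling.

```python
import struct

SOI = b"\xff\xd8"  # start-of-image marker that opens every JPEG
APP1 = 0xE1        # segment type that carries EXIF (including GPS tags)
SOS = 0xDA         # start of scan; compressed image data follows


def strip_app1(jpeg: bytes) -> bytes:
    """Return a copy of a JPEG byte stream with every APP1 segment removed."""
    if not jpeg.startswith(SOI):
        raise ValueError("not a JPEG stream")
    out = bytearray(SOI)
    pos = 2
    while pos < len(jpeg):
        if pos + 4 > len(jpeg) or jpeg[pos] != 0xFF:
            raise ValueError(f"malformed segment at byte {pos}")
        marker = jpeg[pos + 1]
        if marker == SOS:
            # Everything from here on is image data; copy it untouched.
            out += jpeg[pos:]
            return bytes(out)
        # The two-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        end = pos + 2 + length
        if length < 2 or end > len(jpeg):
            raise ValueError(f"bad segment length at byte {pos}")
        if marker != APP1:
            out += jpeg[pos:end]
        pos = end
    raise ValueError("no start-of-scan marker found")


def segment(marker: int, payload: bytes) -> bytes:
    """Build one length-prefixed JPEG segment (helper for the demo below)."""
    return bytes([0xFF, marker]) + struct.pack(">H", len(payload) + 2) + payload


# Synthetic stream with the right segment layout; not a decodable image.
sample = (
    SOI
    + segment(0xE0, b"JFIF\x00" + bytes(9))
    + segment(APP1, b"Exif\x00\x00" + b"fake GPS block")
    + segment(0xDB, bytes(65))
    + bytes([0xFF, SOS]) + struct.pack(">H", 2) + b"scan-data"
    + b"\xff\xd9"
)
cleaned = strip_app1(sample)
print(b"Exif" in sample, b"Exif" in cleaned)
```

The sketch removes the whole APP1 segment instead of editing individual GPS tags, which is blunt but leaves nothing location-related behind in that segment.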

"This is as bad as it gets," White says. "It's gross incompetence on the part of Parler. They marketed themselves as a private, secure, unmoderated platform, and instead it's comedy hour."

Despite being cut off from Amazon Web Services, the Google Play Store, and the Apple App Store, Parler has vowed to return: Company investor Dan Bongino told Fox News on Monday that the service would be online again "by the end of the week."

If and when Parler does return, White argues that it will need to take a hard look at its security engineering more broadly. Its bugs, he speculates, likely run deeper than the ability to download its public data en masse. "If you walk up to a car with duct tape on the bumper, puddles of oil underneath, and rust spots, you can make some reasonable assumptions about the state of the engine," White says. "If a Python script can archive your whole user content with simple web requests, then you've got a serious architecture problem."
