Like many stories about people trying to help fix the internet, this one begins in the aftermath of 2016. From his home in Ireland, Conor Brady had watched the Brexit vote and the election of Donald Trump with disbelief. In his view, the prominence of false stories during each election—whether about Muslim immigrants or Hillary Clinton’s health—was the direct consequence of a hollowed-out news industry without the resources to check the spread of disinformation.
At the time, Conor’s son, Neil—also a former journalist—was working as a digital policy analyst at the Institute of International and European Affairs, researching neural networks and machine learning. The two got to thinking. Wouldn’t it be great, they wondered, if a machine-learning tool could approximate the wisdom of editors and lawyers in order to help overstretched newsrooms? As they thought about it, one use case seemed especially ripe: automated defamation detection. Libel lawsuits are a major threat to news organizations. A system that could flag potentially risky stories before publication could save serious time and money.
“I said to him, ‘Do you think an editor, a journalist, would use that if we could build that kind of tool?’” Neil Brady recalls. “And he said, ‘I’ve no bloody doubt they would.’ And that’s when we said, OK, let’s do it.”
CaliberAI is the startup that eventually launched from that conversation, with a €300,000 pre-seed grant from Enterprise Ireland, a government fund, in November 2020. The basic idea is to provide an extra, automated set of eyes to reporters and editors—like a warning system for potential libel. (Defamation lawsuits tend to be much easier to bring against publishers in Europe than in the US, where the First Amendment gives journalists extra protection.) But the long-term play is more ambitious. The European Union and the United Kingdom are both in the process of crafting laws that could impose new legal liability on platforms for harmful and illegal content, including defamation. In the US, Congress keeps making noises about reforming Section 230 of the Communications Decency Act, the legal shield that protects American companies from liability over user posts. Social media platforms around the world may soon be confronting a version of the legal liability that newspapers have long had to deal with. And their ability to handle it could depend on the success of tools like the one the Bradys are building.
Defamation plays an important but overlooked role in the history of the internet. In the US, Section 230 was originally passed, in 1996, to deal with the fallout from a libel lawsuit. Traditional media organizations, like newspapers and TV news shows, face harsh liability rules for publishing a defamatory claim (a false statement that harms someone’s reputation) or even just passing along a defamatory statement made by someone else. In the 1990s, a trial court ruled that the same standard should apply to online platforms that took steps to moderate user-generated content. This created a perverse incentive: Companies might have avoided moderating anything for fear of falling under the ruling, thereby hosting a complete free-for-all, or they might have chosen to moderate with excessive caution, stifling too many innocent posts in the process. And so Congress passed Section 230, establishing that platforms generally can’t be held liable for what their users post.
A key part of the thinking behind Section 230 was that while a newspaper might publish a few dozen or a hundred stories a day, an internet platform might host thousands or millions or, eventually, billions of pieces of content uploaded by users. At that scale, it’s impossible to vet everything in the same way an editor or legal department might. While the major platforms today enlist thousands of moderators, they rely even more on automation to flag violations. And the challenge appears especially daunting for defamation. Whether a statement is defamatory depends on whether it’s true or false—a particularly tough judgment to automate. Unlike a list of prohibited words, the universe of potential defamatory posts is infinite.
The insight driving CaliberAI is that this universe is a bounded infinity. While AI moderation is nowhere close to being able to decisively rule on truth and falsity, it should be able to identify the subset of statements that could even potentially be defamatory.
Carl Vogel, a professor of computational linguistics at Trinity College Dublin, has helped CaliberAI build its model. He has a working formula for statements highly likely to be defamatory: They must implicitly or explicitly name an individual or group; present a claim as fact; and use some sort of taboo language or idea—like suggestions of theft, drunkenness, or other kinds of impropriety. If you feed a machine-learning algorithm a large enough sample of text, it will detect patterns and associations among negative words based on the company they keep. That will allow it to make intelligent guesses about which terms, if used about a specific group or person, place a piece of content into the defamation danger zone.
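To make Vogel's three-part formula concrete, here is a minimal rule-based sketch. It is not CaliberAI's model, which learns these patterns from data; the word lists, the `check_statement` function, and the capitalization shortcut for spotting a named target are all assumptions invented for illustration.

```python
import string

# Hypothetical word lists, invented for this sketch only.
HEDGES = ("i believe", "i think", "in my opinion")
TABOO_TERMS = {"liar", "thief", "drunk", "fraud"}


def check_statement(text):
    """Apply the three criteria with crude heuristics and report each one."""
    tokens = [t.strip(string.punctuation) for t in text.split()]
    tokens = [t for t in tokens if t]

    # Criterion 1: names an individual or group.
    # Crude proxy: a capitalized word that is not the first word and not "I".
    names_target = any(t[0].isupper() and t != "I" for t in tokens[1:])

    # Criterion 2: presents the claim as fact.
    # Crude proxy: no hedging phrase appears anywhere in the text.
    lowered = text.lower()
    stated_as_fact = not any(hedge in lowered for hedge in HEDGES)

    # Criterion 3: uses taboo language.
    taboo = any(t.lower() in TABOO_TERMS for t in tokens)

    return {
        "names_target": names_target,
        "stated_as_fact": stated_as_fact,
        "taboo": taboo,
        # Flag only when all three criteria hold at once.
        "flag": names_target and stated_as_fact and taboo,
    }


if __name__ == "__main__":
    print(check_statement("Everyone knows John is a liar"))
    print(check_statement("I believe John is a liar"))
```

A hand-written list like this breaks down quickly, which is the case for learning the associations from a large sample of text instead.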
Logically enough, there was no data set of defamatory material sitting out there for CaliberAI to use, because publishers work very hard to avoid putting that stuff into the world. So the company built its own. Conor Brady started by drawing on his long experience in journalism to generate a list of defamatory statements. “We thought about all the nasty things that could be said about any person and we chopped, diced, and mixed them until we’d kind of run the whole gamut of human frailty,” he says. Then a group of annotators, overseen by Alan Reid and Abby Reynolds, a computational linguist and data linguist on the team, used the original list to build up a larger one. They use this made-up data set to train the AI to assign probability scores to sentences, from 0 (definitely not defamatory) to 100 (call your lawyer).
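The general idea of training on labeled sentences and returning a score from 0 to 100 can be sketched with a tiny naive Bayes classifier. The article does not say what kind of model CaliberAI uses, so naive Bayes is swapped in here purely as a simple stand-in, and the training sentences, `train`, and `score` are invented for illustration.

```python
import math
from collections import Counter

# Invented training sentences: 1 = potentially defamatory, 0 = benign.
# These are NOT CaliberAI's data.
TRAINING = [
    ("john is a thief", 1),
    ("mary is a liar", 1),
    ("the mayor stole money", 1),
    ("he is a drunk and a fraud", 1),
    ("john is a teacher", 0),
    ("mary went to the market", 0),
    ("the mayor opened a school", 0),
    ("he is a kind neighbor", 0),
]


def train(examples):
    """Count words per label and documents per label."""
    counts = {0: Counter(), 1: Counter()}
    docs = Counter()
    for text, label in examples:
        counts[label].update(text.lower().split())
        docs[label] += 1
    vocab = set(counts[0]) | set(counts[1])
    return counts, docs, vocab


def score(text, model):
    """Return a 0-100 score: the model's estimate that the text is risky."""
    counts, docs, vocab = model
    total_docs = docs[0] + docs[1]
    log_post = {}
    for label in (0, 1):
        n_words = sum(counts[label].values())
        log_p = math.log(docs[label] / total_docs)
        for word in text.lower().split():
            # Add-one smoothing; the extra +1 leaves room for unseen words.
            log_p += math.log((counts[label][word] + 1) / (n_words + len(vocab) + 1))
        log_post[label] = log_p
    # Normalize the two log scores into a probability for label 1.
    top = max(log_post.values())
    p_risky = math.exp(log_post[1] - top)
    p_benign = math.exp(log_post[0] - top)
    return round(100 * p_risky / (p_risky + p_benign))


if __name__ == "__main__":
    model = train(TRAINING)
    for sentence in ("paul is a liar", "paul is a teacher"):
        print(sentence, "->", score(sentence, model))
```

With only eight sentences the scores mean very little; the point is the shape of the pipeline, in which labeled examples go in and a graded advisory comes out.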
The result, so far, is something like spell-check for defamation. You can play with a demo version on the company’s website, which cautions that “you may notice false positives/negatives as we refine our predictive models.” I typed in “I believe John is a liar,” and the program spit out a probability of 40, below the defamation threshold. Then I tried “Everyone knows John is a liar,” and the probability jumped to 80, with the program flagging “Everyone knows” (statement of fact), “John” (specific person), and “liar” (negative language). Of course, that doesn’t quite settle the matter. In real life, my legal risk would depend on whether I can prove that John really is a liar.
“We are classifying on a linguistic level and returning that advisory to our customers,” says Paul Watson, the company’s chief technology officer. “Then our customers have to use their many years of experience to say, ‘Do I agree with this advisory?’ I think that’s a very important fact of what we’re building and trying to do. We’re not trying to build a ground-truth engine for the universe.”
It’s fair to wonder whether professional journalists really need an algorithm to warn that they might be defaming someone. “Any good editor or producer, any experienced journalist, ought to know it when he or she sees it,” says Sam Terilli, a professor at the University of Miami’s School of Communication and the former general counsel of the Miami Herald. “They ought to be able to at least identify those statements or passages that are potentially risky and worthy of a deeper look.”
That ideal might not always be in reach, however, especially during a period of thin budgets and heavy pressure to publish as quickly as possible.
“I think there’s a really interesting use case with news organizations,” says Amy Kristin Sanders, a media lawyer and journalism professor at the University of Texas. She points out the particular risks involved with reporting on breaking news, when a story might not go through a thorough editorial process. “For small- to medium-size newsrooms—who don’t have a general counsel present with them on a daily basis, who may rely on lots of freelancers, and who may be short staffed, so content is getting less of an editorial review than it has in the past—I do think there could be value in these kinds of tools.”
On the other hand, Sanders says, adopting a tool like CaliberAI could increase a publication’s legal exposure if it turned out that a journalist ignored a warning sign before publishing something defamatory. “I would not want my client to be the first publication to try this out,” she says. “Kudos to them for making it; let’s see what the courts think about this.”
The first set of CaliberAI users will be media organizations. The company is currently negotiating its first contract, with a chain of Irish newspapers owned by the Belgian publishing group Mediahuis. The far more interesting potential market, however, is not traditional media but social media. Neil Brady says he has had some preliminary conversations with major social networks. But as things stand, they have little reason to invest in something like CaliberAI’s software because they generally can’t get sued over user posts. The question is how long that will remain the case.
In the EU, under the upcoming Digital Services Act, platforms will be liable for illegal content that they know about and fail to remove. And in the US, the congressional debate around repealing or modifying Section 230 remains lively. (“We need Joe Biden to bring in liability,” Neil Brady half-jokes, when I ask about his company’s biggest challenges. “If he could just get on with that, that would be nice.”) Whatever final form these new liability rules take, platforms will almost certainly need new probabilistic methods of identifying and screening content that they mostly haven’t had to worry about so far, including defamation. In the case of CaliberAI, Neil Brady says, that could involve moderators using its tool, but it also could involve applying its analysis to steer users away from inflammatory posts in the first place.
“The internet is so dysfunctional in so many ways, and yet at the same time, there’s this very difficult balance to be struck between censorship and freedom of expression,” he says. “In the longer term, one of the big ways I can see that problem being addressed is the insertion of intelligent layers of technology like this, that essentially try and nudge better decisionmaking. It’s a kind of nudge-tech.”
One of the most compelling arguments made by people who support Section 230 immunity is that changing it would disproportionately hurt smaller platforms. Facebook can afford to expand moderation, the thinking goes, but newer companies might not. Startups like CaliberAI represent the other side of the coin. If legal changes force platforms to have more robust content moderation from the get-go, they won’t all build their systems from scratch. Companies like CaliberAI will proliferate to satisfy the startup market’s demand for moderation tools, in the same way many startups outsource payroll or other business functions.
It would be fitting if a team led by journalists helped shape the next phase of social media content moderation. Conor Brady, who teaches journalism at an Irish university, notes that the journalism profession is guided not just by legal pressures but by a set of values—like accuracy, impartiality, and independence—that date back to the late 19th century. Thinking on that timescale, it’s little wonder that social media hasn’t developed its own set of analogous norms. Conor likes to give his students a thought experiment. “Think about how you can actually take what are essentially 19th- and 20th-century editorial values and re-embed them, recast them in 21st-century technology,” he says. “It’s an easy thing to say it; it’s a damn difficult thing to actually put into effect.”