
How Twitter’s New Safety Policies Are Created

Twitter unveiled a new policy Tuesday concerning “dehumanizing speech,” which will go into effect later this year. It’s designed to prohibit users from dehumanizing people based on their membership in a protected class, even if they’re not targeting a specific individual directly. In other words, Twitter is trying to curb the sexism and racism that users—especially minority groups—have complained about for years. The social media company develops new rules regularly, but this time it’s doing something different: asking the public what it thinks. For the next two weeks, anyone can answer Twitter’s form asking for feedback about whether the new policy is clear and how it might be improved.

This week, WIRED spoke with Del Harvey, Twitter's long-standing vice president of trust and safety, about why the company is changing its approach to dehumanizing speech. For more than a decade, Harvey has been Twitter’s go-to person when it comes to spam, abuse, threats of violence, and trolls. When she first started at Twitter in 2008—as employee number 25—Harvey was the only person working in the trust and safety department. Now she leads the team that helps decide where the line is between free speech and dangerous harassment on one of the world’s largest social media platforms. Here are her thoughts on Trump, trolls, and the company’s new focus on safety.

WIRED: I’m curious about why you wanted to solicit public feedback on the dehumanizing-speech policy. This is the first time to my knowledge—and please correct me if I’m wrong—in Twitter’s history that you have done this. So why now? Why ask people what they think about a policy like this?

Del Harvey: There are a whole bunch of reasons why we’re asking the public. One of the biggest ones is that, historically, we have been less transparent than, quite frankly, I think is ideal about our policies and how we develop them, the principles that guide their development. Our hope is that having this feedback period will serve to bring people along with us on the process. We obviously have historically solicited feedback from various NGOs in a given space or the subject matter experts on the Trust and Safety Council, or different regional groups, but even with that, there is very much always the possibility that we could miss something, that something wouldn’t get raised. What’s really the goal is trying to make sure that we are being thoughtful and that we do really try to capture all the possible variances that we might not think of inherently when we are coming up with these policies. We do really hope that the public feedback period might actually help us with that. This is the first time, absolutely.

WIRED: Is there a specific region or specific type of feedback that you’re looking for? Is it fair to say that you’re looking for things you haven’t thought of or that your partners haven’t thought of in this?

Harvey: I mean, actually, I feel like it’s kind of helpful to know whether people think, “Yeah, it sounds like you’ve got this covered.” Not just "Here’s something you haven’t thought of," but having a better understanding of what are people’s expectations in this space. Does this match with what people think should be the policy? Do they feel like this is even fair? Where do they think the line should be? That’s more than just “Oh, you missed something about this.” If we find that there’s a major disconnect, then it gives us an opportunity to figure out how can we do more around education. How can we do more around communication in this space? What can we do to try to proactively make sure people understand where we’re coming from with this and why we ended up where we did?

WIRED: Right, so it’s two parts. It’s the policy itself, but also making sure that the policy is communicated and people understand. I know that this is building upon the hateful-conduct policy, which already exists. It’s going from it being a specific attack against an individual to now, when this policy is in place, it will protect against attacks that target an entire group. Is that the way you’re thinking about it?

Harvey: To give a slight tweak on that, it’s actually around whether or not it’s actually targeted at someone. So, for example, if you sent me an @ reply that said, “All lesbians are scum and should die,” even though that didn’t say you are scum and you should die, that would still be a violation of our hateful-conduct policy today. But if it wasn't tweeted to me, that would still be a violation of this new dehumanization policy that we’re talking about. It’s taking away the requirement that an individual who is a member of the group being discussed be referenced in the tweet itself.

WIRED: How did you guys come up with this policy in the first place? I know that in August, when there was a debate about whether to remove Alex Jones, it came to light that you were working on this policy and you were choosing to expedite it. Why is it important that Twitter have something like this in place sooner rather than later?

Harvey: As we look at behavior on the platform and try to identify where there are behaviors that are problematic, especially when they potentially could lead to real-world harm, if they're not covered by our policies, then it’s something we want to try to figure out. Should they be covered by our policies? Generally speaking, my view is that if there is a behavior that is likely to result in real-world harm, we should be looking at what are the guardrails around something like that. How can we make sure that we're taking our role seriously in terms of trying to protect people?

Part of the reason I think we’re talking about this now is that we’ve made a lot of changes over the past couple of years to improve our enforcement options past just suspension. Some time ago, you were suspended or you were not suspended. That was pretty much all we had. We’ve tried to add a lot of new enforcement options. Now that we have some of those additional options, and now that we’re working on developing more, it gives us the opportunity to really dig into some areas that were maybe gray areas in the past that we didn’t have policy coverage for.

We obviously get reports from people about content that they believe violates our rules that does not. The dehumanizing content and the dehumanizing behavior is one of the areas that really makes up a significant chunk of those reports. We’ve gotten feedback not just in terms of the research that’s out there about potential real-world harms, we’ve gotten feedback from the people who use Twitter about this being something they view as deeply problematic. All of those things add together to say we should absolutely be trying to make sure we aren’t limiting how we think about our policies to just those that are dealing with whether an individual was specifically referenced.

WIRED: In the future, how will the dehumanizing-speech policy work in tandem with other policies that Twitter is working on? I know that you are looking at a policy for off-platform behavior, for example. That would potentially take into account someone’s behavior off Twitter when considering enforcement options.

Harvey: We have added new enforcement options, but we absolutely still need to develop more nuanced and tailored approaches to certain violations, which would include whether or not we would want potentially to factor in off-platform behavior when evaluating a specific violation.

One of the other pieces that goes along with all of this is trying to think about how we can provide both proactive and just-in-time checkpoints. If somebody is about to do something, it seems as though they’re headed down a bad path, how can we intervene to try to educate, or if they had already done that violation, to rehabilitate people?

We can come up with policies all day long. But unless we come up with a way of communicating them to people better and making sure they understand what our rules are, what behavior violates them, what the consequences are of that violation, what to expect from us—unless people really understand those things, our policies are going to be not enough. They’re not going to be a factor if people don't understand they exist in the first place. We are actually working on how we can completely overhaul the Twitter rules to try to make them more accessible and transparent.

WIRED: Beyond just overhauling the policies themselves, one thing I have been thinking about a lot is the actual interface, the way that Twitter works and the way that tweets work. Have you thought about things like news organizations embedding tweets, quote tweeting, or taking threads out of context? These are aspects of how Twitter works, and sometimes they end up amplifying abuse. Along with the rules, are you guys thinking about the actual way the platform functions and whether changes need to be made to keep people safer than they have been in the past?

Harvey: Yeah, absolutely. One of the challenges that is incredibly relevant is the challenge of context collapse. When you see a single tweet quoted out of context of whatever the rest of the conversation was, that is almost a perfect example of an opportunity for context collapse. One of the things that Jack [Dorsey, Twitter's CEO] has actually referenced is that we need to figure out what incentives we have built into the product. Do we need to revisit them or reframe them to actually encourage and help people understand what healthy public conversation is, as opposed to incentivizing bad behaviors?

One other piece that really factors in here is that we also want to try to do more to provide meaningful transparency around our rules and processes and enforcement. How can we make sure people understand when enforcement actions have been taken? If you actually have no clue that we took an enforcement action, even if we shifted how the product works, you may just do some other variation of it without realizing what the reason was behind it.

WIRED: What about adding labels, like a label that says for example, “This is part of a thread.” Would you guys ever consider a label like that to bring back some of that context into the discussion?

Harvey: Totally. You could envision there being something like “Click here to view the thread.” There’s a whole bunch of different sort of in-product things we can explore in that space. But I think conversations like this, or suggestions like this, are actually sort of indicative of why we are trying to do more in terms of doing public comment periods, or soliciting feedback from a broader set of people than just the standard group, because there are so many different needs that people have. We really do want to serve the health of the public conversation. We want to make sure that people can express themselves freely on Twitter and feel safe doing so.

WIRED: With the context collapse you’re talking about, and everyone having their own experience, I wonder about how Twitter is dealing with being a global platform where there are so many different voices and different languages. Some of the most horrific consequences of the rise of these platforms, like Twitter and Facebook, have unfolded in the parts of the world that are the farthest from the United States. How are you thinking about the dehumanization policy and other policies in a global context?

Harvey: It’s something I think about a lot. We’ve always tried to develop policies with a global mind-set, but to have an understanding of the cultural nuances and the cultural context of how those policies actually impact people, we have historically talked to the members of the Trust and Safety Council. We also have a public policy team that works with various NGOs and groups that work on issues related to human rights and the protections that people really need. We work really closely with them as well through the policy development process. I think it’s pretty unlikely that people generally have any clue that we do proactive outreach to try to understand how our given policies might impact a given region.

It’s also incredibly important that we think about this as it affects different groups who use Twitter, to make sure we understand the potential consequences there too. It’s tremendously important, because even looking at something like the way men tend to experience abuse online versus the way women do, you see huge differences. That’s why making sure that we are actually getting feedback from people who have lived that is so important.

WIRED: That’s important work, and it’s also really resource-intensive. Over the past year, it seems publicly that Twitter has shifted its focus to prioritizing safety. Do you feel like that prioritization is reflected internally? Have you been able to make more hires, has your team gotten more preference or priority inside the company?

Harvey: I think the biggest shift is that we have really broadened the teams that work on these issues. Some years ago, there was sort of this [attitude] of “We’ll just handle this with policies.” You can’t just handle this with policies. It has to be a cross-functional effort. There has to be product involvement, you have to have research involved, you need design involved. You need all of the different stakeholders to be part of the conversation, because that shift means you can really start unlocking things that weren’t even possible before.

With the company shifting the focus to health, there has absolutely been a similar shift in terms of who's working on it. It’s not just lip service, where it’s like, "Yeah, we really care about health as a company." We care about health. It’s the top priority. If you look at the [objectives and key results] for the company, you see health called out as this is what we are going to get right.

WIRED: How are you anticipating or thinking about enforcing the dehumanizing-speech policy when it goes into effect? What kinds of resources and mechanisms will be used to ensure that dehumanizing speech that does violate Twitter’s rules is removed?

Harvey: We’re continuing to try to improve how our reporting process works. If we get reports, what is the context of those reports? Right now, we rely pretty heavily on reports to identify potential violations of Twitter’s rules. We are exploring how, along with all of the more nuanced enforcement options, we can further expand our user behavioral signals to try to identify and take action on violations more quickly. And then if we get it wrong, have that appeals process that helps people tell us exactly how and why we got it wrong and what we can do.

We’re also thinking about how we can do more in terms of transparency around helping people understand when something isn’t a violation, why it isn’t a violation. Something we have been thinking about is how we can develop more resources for people to understand what a violation looks like and what a violation is not, and here’s what’s different about the two. So that people can have that better understanding and we’re quite frankly doing a better job of meeting people’s expectations.

WIRED: It’s interesting to hear you say words like “rehabilitate,” and these words that put you into the shoes of the person who’s violating the rules, and figuring out how to make their experience better. It’s an important part of the puzzle that’s not always talked about. Obviously the person who is being abused should be the first priority, but it’s interesting to think about, how do you prevent people from breaking the rules in the future? How did you develop that perspective?

Harvey: The first concern is for the person who’s being targeted; we want to make sure they understand what recourse they have, how we can help them, how we can make sure they get to a good place. But for the people who are violating the rules, the vast majority, we’ve found, aren’t doing it deliberately. A lot of times they just didn’t even know that [something] violated our rules. Maybe they got into an argument that got too heated, or maybe they got a little carried away. We’ve found that 65 percent of people [who had their account functionality limited for a rule violation] weren’t in that state again, which is crazy if you think about it.

Again, the challenge of context collapse comes in with people who violate our rules too. They maybe didn’t mean to violate the rules, but context collapse meant that it ended up violating our rules because all parties involved weren’t on the same page. We can really shift to focusing on trying to get people to understand what our rules are, how we enforce them, what the consequences are of violating them, and what the pathway back to good is. And then for that really small subset of deliberate bad actors who are absolutely trying to break our rules and are not going to care about education and rehabilitation, that’s again where the more nuanced enforcement approach comes into play. We don’t need to give them the opportunity to prove that they can be a good citizen if they've already demonstrated that there is no good-faith willingness to engage there.

WIRED: It makes sense to rehabilitate people and to go through a process where you inform them about what’s going on. I think that can be really difficult, though, if you have a high-profile person, or the speech is really newsworthy. After this policy goes into place, what would happen were President Trump to make a comment like the one he made in May, when he compared some immigrants to animals? What would happen if you have a kind of newsworthy situation like that, but it conflicts with the dehumanizing-speech policy? You can’t necessarily not let the president tweet for 12 hours or something. What would happen in that situation?

Harvey: I mean, it’s tough to comment on hypotheticals. But quite frankly, that's an issue we want to explore in the comment period. Where do people see communication as being helpful here, to help them understand what action we did or did not take? Where should we be exploring expanding those enforcement options, and where is the right balance? It’s not set in stone, and it’s something we’re going to be continuing to explore. We are also very much thinking about how we balance that against how people experience the platform as a whole.

This interview has been lightly edited and condensed for clarity.
