Earlier this year, Google artificial intelligence researcher Timnit Gebru sent a Twitter message to University of Washington professor Emily Bender. Gebru asked Bender if she had written about the ethical questions raised by recent advances in AI that processes text. Bender hadn’t, but the pair fell into a conversation about the limitations of such technology, such as evidence it can replicate biased language found online.
Bender found the DM discussion enlivening and suggested building it into an academic paper. “I hoped to provoke the next turn in the conversation,” Bender says. “We’ve seen all this excitement and success, let’s step back and see what the possible risks are and what we can do.” The draft was written in a month with five additional coauthors from Google and academia and was submitted in October to an academic conference. It would soon become one of the most notorious research works in AI.
By Tom Simonite
Last week, Gebru said she was fired by Google after objecting to a manager’s request to retract or remove her name from the paper. Google’s head of AI said the work “didn’t meet our bar for publication.” Since then, more than 2,200 Google employees have signed a letter demanding more transparency into the company’s handling of the draft. Saturday, Gebru’s manager, Google AI researcher Samy Bengio, wrote on Facebook that he was “stunned,” declaring “I stand by you, Timnit.” AI researchers outside Google have publicly castigated the company’s treatment of Gebru.
The furor gave the paper that catalyzed Gebru’s sudden exit an aura of unusual power. It circulated in AI circles like samizdat. But the most remarkable thing about the 12-page document, seen by WIRED, is how uncontroversial it is. The paper does not attack Google or its technology and seems unlikely to have hurt the company’s reputation if Gebru had been allowed to publish it with her Google affiliation.
The paper surveys previous research on the limitations of AI systems that analyze and generate language. It doesn’t present new experiments. The authors cite prior studies showing that language AI can consume vast amounts of electricity and echo unsavory biases found in online text. And they suggest ways AI researchers can be more careful with the technology, including by better documenting the data used to create such systems.
Google’s contributions to the field—some now deployed in its search engine—are referenced but not singled out for special criticism. One of the studies cited, showing evidence of bias in language AI, was published by Google researchers earlier this year.
“This article is a very solid and well-researched piece of work,” says Julien Cornebise, an honorary associate professor at University College London who has seen a draft of the paper. “It is hard to see what could trigger an uproar in any lab, let alone lead to someone losing their job over it.”
Google’s reaction might be evidence that company leaders feel more vulnerable to ethical critiques than Gebru and others realized—or that her departure was about more than just the paper. The company did not respond to a request for comment. In a blog post Monday, members of Google’s AI ethics research team suggested that managers had turned Google’s internal research-review process against Gebru. Gebru said last week that she may have been removed for criticizing Google’s diversity programs and suggesting in a recent group email that coworkers stop participating in them.
The draft paper that set the controversy in motion is titled “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?” (It includes a parrot emoji after the question mark.) It turns a critical eye on one of the most lively strands of AI research.
Tech companies such as Google have invested heavily in AI since the early 2010s, when researchers discovered they could make speech and image recognition much more accurate using a technique called machine learning. These algorithms can refine their performance at a task, say transcribing speech, by digesting example data annotated with labels. An approach called deep learning enabled stunning new results by coupling learning algorithms with much larger collections of example data and more powerful computers.
In the past few years, researchers figured out how to super-scale machine learning models for language too. They showed major progress on tasks such as answering questions or generating text by having machine-learning algorithms digest billions of words of text scraped from the web. Those systems operate on the statistical patterns of language. They don’t understand the world in the way humans do and can still make blunders that seem obvious to a person. But they can number-crunch their way to impressive feats such as answering questions or generating fluid new text.
One such system, Google’s BERT, is used to improve how the company’s search engine handles long queries. Microsoft said it will license a system called GPT-3 from independent lab OpenAI that is also being tapped by entrepreneurs to write emails and ad copy.
That progress has prompted other researchers to question the limitations and possible societal effects of this new language technology. Gebru, Bender and their coauthors set out to draw this work together and suggest how the research community should respond.
The authors point to previous research that calculated that training a large language model can consume as much energy as a car does from construction to junkyard, and to a project that showed AI could mimic online conspiracy theorists.
Another study cited by the paper was published by Google researchers earlier this year, and showed limitations of BERT, the company’s own language model. The team, which did not include Gebru, showed that BERT tended to associate phrases referring to disabilities such as cerebral palsy or blindness with negative language. All of the authors appear to still work at Google.
In the paper that precipitated Gebru’s exit, she and her coauthors urge AI developers to be more cautious with language projects. They recommend researchers do more to document the text used to create language AI and the limitations of systems made with it. They point readers to some recently proposed ideas for labeling AI systems with data on their accuracy and weaknesses. One cocreated by Gebru at Google is called model cards for model reporting and has been adopted by Google’s cloud division. The paper asks researchers building language systems to consider not only the perspective of AI developers, but also those of people outside the field who may be subjected to the systems’ outputs or judgments.
In his statement on Gebru’s departure last week claiming the paper was of poor quality, Google’s head of research, Jeff Dean, said it failed to cite research on making more efficient language models and ways to mitigate bias.
Bender says the authors included 128 citations and will likely add more. Such additions are common practice during the academic publishing process and are not usually reason to withdraw a paper. She and other AI researchers also say that despite Dean’s comment, the field is far from inventing a way to reliably eradicate language bias.
“That’s still work in progress because the bias takes many forms,” says Oren Etzioni, CEO of the Allen Institute for AI, which has done its own research on the topic, including some cited in the draft paper. “There’s a recognition from pretty much everybody who works in the field that these models are becoming increasingly influential and that we have an ethical obligation to deploy them responsibly.”