Natural language processing
In recent years machines have learned to generate passable snippets of English, thanks to advances in artificial intelligence. Now they are moving on to other languages.
Aleph Alpha, a startup in Heidelberg, Germany, has built one of the world’s most powerful AI language models. Befitting the algorithm's European origins, it is fluent not just in English but also in German, French, Spanish, and Italian.
The algorithm builds on recent advances in machine learning that have helped computers handle language with what sometimes seems like real understanding. By drawing on what it has learned from reading the web, the algorithm can dream up coherent articles on a given subject and can answer some general knowledge questions cogently.
The answers, though, may differ from those produced by similar programs developed in the US. Asked about the best sports team in history, Aleph Alpha responds with a famous German soccer team. A US-built model is more likely to cite the Chicago Bulls or New York Yankees. Write the same query in French, and the answer will likely mention a famous French team, as the algorithm tunes its cultural perspective. Aleph Alpha is designed to be multilingual, meaning you can ask it a question in one language and get the answer in another.
“This is transformative AI,” says Jonas Andrulis, founder and CEO of Aleph Alpha, who previously worked on AI at Apple. “If Europe doesn't have the technical competence to build these systems, then we're relegated to being users of something from the US or China.”
After decades of slow progress in teaching machines to grasp the meaning of words and sentences, machine learning has produced some promising results. Startups are rushing to spin gold out of AI’s growing language skills.
OpenAI, a US startup, was the first to showcase a powerful new kind of AI language model, called GPT-2, in 2019. It offers a new, more powerful version, GPT-3, to select startups and researchers through an API. A few other US companies, including Cohere and Anthropic, which was founded by alumni of OpenAI, are working on similar tools.
Now, a growing number of companies outside the US—in China, South Korea, and Israel as well as Germany—are building general-purpose AI language tools. Each effort has its own technical twists, but all are based on the same advances in machine learning.
The rise of AI programs that wield language in useful ways is partly about money. All sorts of things can be built on top of them: intelligent email assistants, programs that write useful computer code, and systems that generate marketing copy, to name a few.
Getting machines to grasp language has long been a grand challenge in AI. Language is so powerful because of the way words and concepts can be combined to convey a virtually infinite landscape of ideas and thoughts. But decoding the meaning of words can also be surprisingly difficult because of frequent ambiguity, and it’s impossible to write all the rules of language into a computer program (although some have tried).
Recent strides in AI show that machines can develop some notable language skills simply by reading the web.
In 2018, researchers at Google released details of a powerful new kind of large neural network specialized for natural language understanding called Bidirectional Encoder Representations from Transformers, or BERT. This showed that machine learning could yield new advances in language understanding and sparked efforts to explore the possibilities.
A year later, OpenAI demonstrated GPT-2, built by feeding a very large language model vast amounts of text from the web. Training such a model requires a huge amount of computer power, costing millions of dollars by some estimates, as well as considerable engineering skill, but it seems to unlock a new level of understanding in the machine. GPT-2 and its successor GPT-3 can often generate paragraphs of coherent text on a given subject.
“What's surprising about these large language models is how much they know about how the world works simply from reading all the stuff that they can find,” says Chris Manning, a professor at Stanford who specializes in AI and language.
But GPT and its ilk are essentially very talented statistical parrots. They learn how to re-create the patterns of words and grammar that are found in language. That means they can blurt out nonsense, wildly inaccurate facts, and hateful language scraped from the darker corners of the web.
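The “statistical parrot” idea can be illustrated with a toy that is far simpler than GPT. The Python sketch below is an assumption-laden stand-in, not how GPT-2 or GPT-3 actually work: those systems use large neural networks, while this one merely counts which word follows which in a tiny made-up corpus and then replays those patterns. The function names `train_bigram_model` and `generate`, and the sample sentence, are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def train_bigram_model(text):
    """Record, for each word, every word that follows it in the text."""
    words = text.lower().split()
    followers = defaultdict(list)
    for current_word, next_word in zip(words, words[1:]):
        followers[current_word].append(next_word)
    return followers


def generate(followers, start_word, length, seed=0):
    """Produce up to `length` words by repeatedly picking a word that
    followed the previous one somewhere in the training text."""
    rng = random.Random(seed)  # fixed seed so repeated runs match
    output = [start_word]
    for _ in range(length - 1):
        candidates = followers.get(output[-1])
        if not candidates:
            break  # the last word was never followed by anything
        output.append(rng.choice(candidates))
    return " ".join(output)


# Hypothetical toy corpus; a real model would read a large slice of the web.
corpus = "the cat sat on the mat and the dog sat on the rug"
model = train_bigram_model(corpus)
sample = generate(model, "the", 8)
print(sample)
```

Every adjacent pair of words in the output appeared somewhere in the training text, yet the sentence as a whole may never have been written by anyone. That is the sense in which such a system can sound fluent while still producing nonsense.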
Amnon Shashua, a professor of computer science at the Hebrew University of Jerusalem, is the cofounder of another startup building an AI model based on this approach. He knows a thing or two about commercializing AI, having sold his last company, Mobileye, which pioneered using AI to help cars spot things on the road, to Intel in 2017 for $15.3 billion.
Shashua’s new company, AI21 Labs, which came out of stealth last week, has developed an AI algorithm, called Jurassic-1, that demonstrates striking language skills in both English and Hebrew.
In demos, Jurassic-1 can generate paragraphs of text on a given subject, dream up catchy headlines for blog posts, write simple bits of computer code, and more. Shashua says the model is more sophisticated than GPT-3, and he believes that future versions of Jurassic may be able to build a kind of common-sense understanding of the world from the information they gather.
Other efforts to re-create GPT-3 reflect the world’s—and the internet’s—diversity of languages. In April, researchers at Huawei, the Chinese tech giant, published details of a GPT-like Chinese language model called PanGu-alpha (written as PanGu-α). In May, Naver, a South Korean search giant, said it had developed its own language model, called HyperCLOVA, that “speaks” Korean.
Jie Tang, a professor at Tsinghua University, leads a team at the Beijing Academy of Artificial Intelligence that developed another Chinese language model called Wudao (meaning “enlightenment”) with help from government and industry.
The Wudao model is considerably larger than any other, meaning that its simulated neural network is spread across more cloud computers. Increasing the size of the neural network was key to making GPT-2 and -3 more capable. Wudao can also work with both images and text, and Tang has founded a company to commercialize it. “We believe that this can be a cornerstone of all AI,” Tang says.
Such enthusiasm seems warranted by the capabilities of these new AI programs, but the race to commercialize such language models may also move more quickly than efforts to add guardrails or limit misuses.
Perhaps the most pressing worry about AI language models is how they might be misused. Because the models can churn out convincing text on a subject, some people worry that they could easily be used to generate bogus reviews, spam, or fake news.
“I would be surprised if disinformation operators don't at least invest serious energy experimenting with these models,” says Micah Musser, a research analyst at Georgetown University who has studied the potential for language models to spread misinformation.
Musser says research suggests that it won’t be possible to use AI to catch disinformation generated by AI. There’s unlikely to be enough information in a tweet for a machine to judge whether it was written by a machine.
More problematic kinds of bias may be lurking inside these gigantic language models, too. Research has shown that language models trained on Chinese internet content will reflect the censorship that shaped that content. The programs also inevitably capture and reproduce subtle and overt biases around race, gender, and age in the language they consume, including hateful statements and ideas.
Similarly, these big language models may fail in unexpected ways, adds Percy Liang, another computer science professor at Stanford and the lead researcher at a new center dedicated to studying the potential of powerful, general-purpose AI models like GPT-3.
Researchers at Liang’s center are developing their own massive language model to understand more about how these models actually work and how they can go wrong. “A lot of the amazing things that GPT-3 can do, even the designers didn't anticipate,” he says.
The companies developing these models promise to vet those who have access to them. Shashua says AI21 will have an ethics committee to review uses of its model. But as tools proliferate and become more accessible, it isn’t clear that all misuses would be caught.
Stella Biderman, an AI researcher behind an open source GPT-3 competitor called Eleuther, says it isn’t technically very difficult to replicate an AI model like GPT-3. The barrier to creating a powerful language model is shrinking for anyone with a few million dollars and a few machine learning graduates. Cloud computing platforms such as Amazon Web Services now offer anyone with enough money the tools that make it easier to build neural networks on the scale needed for something like GPT-3.
Tang, at Tsinghua, is designing his model to make use of a database of facts, to give it more grounding. But he’s not confident that will be enough to ensure the model does not misbehave. “I’m really not sure,” Tang says. “This is a big question for us and all the people working on these big models.”
Updated 8/23/21, 4:10 pm EDT: This story has been updated to correct the name of Amnon Shashua's startup from AI21 to AI21 Labs, and removed a reference that incorrectly described its AI model as “bilingual.”