Natural language processing
Google often uses its annual developer conference, I/O, to showcase artificial intelligence with a wow factor. In 2016, it introduced the Google Home smart speaker with Google Assistant. In 2018, Duplex debuted to answer calls and schedule appointments for businesses. In keeping with that tradition, last month CEO Sundar Pichai introduced LaMDA, AI “designed to have a conversation on any topic.”
In an onstage demo, Pichai demonstrated what it’s like to converse with a paper airplane and the celestial body Pluto. For each query, LaMDA responded with three or four sentences meant to resemble a natural conversation between two people. Over time, Pichai said, LaMDA could be incorporated into Google products including Assistant, Workspace, and most crucially, search.
“We believe LaMDA’s natural conversation capabilities have the potential to make information and computing radically more accessible and easier to use,” Pichai said.
The LaMDA demonstration offers a window into Google’s vision for search that goes beyond a list of links and could change how billions of people search the web. That vision centers on AI that can infer meaning from human language, engage in conversation, and answer multifaceted questions like an expert.
Also at I/O, Google introduced another AI tool, dubbed Multitask Unified Model (MUM), which can consider searches with text and images. VP Prabhakar Raghavan said users someday could take a picture of a pair of shoes and ask the search engine whether the shoes would be good to wear while climbing Mount Fuji.
MUM generates results across 75 languages, which Google claims gives it a more comprehensive understanding of the world. A demo onstage showed how MUM would respond to the search query “I’ve hiked Mt. Adams and now want to hike Mt. Fuji next fall, what should I do differently?” That query is phrased differently from how you probably search Google today, because MUM is meant to reduce the number of searches needed to find an answer. MUM can both summarize and generate text; it will know to compare Mount Adams to Mount Fuji and that trip prep may require search results for fitness training, hiking gear recommendations, and weather forecasts.
In a paper titled “Rethinking Search: Making Experts Out of Dilettantes,” published last month, four engineers from Google Research envisioned search as a conversation with human experts. An example in the paper considers the search “What are the health benefits and risks of red wine?” Today, Google replies with a list of bullet points. The paper suggests a future response might look more like a paragraph saying red wine promotes cardiovascular health but stains your teeth, complete with mentions of—and links to—the sources for the information. The paper shows the reply as text, but it’s easy to imagine oral responses as well, like the experience today with Google Assistant.
But relying more on AI to decipher text also carries risks, because computers still struggle to understand language in all its complexity. The most advanced AI systems for tasks such as generating text or answering questions, known as large language models, have shown a propensity to amplify bias and to generate unpredictable or toxic text. One such model, OpenAI’s GPT-3, has been used to create interactive stories for animated characters but also has generated text about sex scenes involving children in an online game.
As part of a paper and demo posted online last year, researchers from MIT, Intel, and Facebook found that large language models exhibit biases based on stereotypes about race, gender, religion, and profession.
Rachael Tatman, a linguist with a PhD in the ethics of natural language processing, says that as the text generated by these models grows more convincing, it can lead people to believe they’re speaking with AI that understands the meaning of the words it’s generating—when in fact it has no common-sense understanding of the world. That can be a problem when a model generates toxic text about people with disabilities or Muslims, or tells people to commit suicide. Growing up, Tatman recalls being taught by a librarian how to judge the validity of Google search results. If Google combines large language models with search, she says, users will have to learn how to evaluate conversations with expert AI.
Google is a company built on PageRank, an algorithm created from research by company cofounders Larry Page and Sergey Brin in the late 1990s. It relies on indexing, a process using algorithms to sort and evaluate websites. Over time, Google incorporated its Knowledge Graph—a huge reservoir of facts—into search results.
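The idea behind PageRank can be sketched in a few lines of code: a page is important if important pages link to it, computed by repeatedly passing rank along links. This is an illustrative toy implementation only, not Google’s production algorithm; the graph, damping factor, and iteration count are assumptions chosen for the example.

```python
# Minimal PageRank sketch (illustrative toy, not Google's production code).
# Each page's rank is split among the pages it links to, with a damping
# factor modeling a searcher who occasionally jumps to a random page.

def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with equal rank everywhere
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if not outlinks:  # dangling page: spread its rank evenly
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
            else:
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical three-page web: both A and C link to B, so B ranks highest.
graph = {"A": ["B"], "B": ["C"], "C": ["A", "B"]}
ranks = pagerank(graph)
```

Real web-scale indexing adds far more signals on top of this link analysis, but the core intuition—rank flowing along hyperlinks until it stabilizes—is the same.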
More recently, Google incorporated language models into its search replies. In 2019, the company injected a model it calls BERT into search to answer conversational search queries, suggest searches, and summarize the text that appears below a search result. At the time, Google VP Pandu Nayak called it the biggest advance in search in five years and “one of the biggest leaps forward in the history of search.” BERT also powers search results for Microsoft’s Bing.
BERT’s introduction in 2018 kicked off a race among tech giants to create ever bigger language models and climb popular performance leaderboards, such as GLUE, on tasks like language understanding and question answering. Soon after, Baidu introduced Ernie, Nvidia made Megatron, Microsoft made T-NLG, and OpenAI made GPT-3. Engineers often assess these models by the number of parameters, a measurement of connections between artificial neurons in a deep learning system. BERT included hundreds of millions of parameters, and GPT-3 has 175 billion. In January, Google released a 1-trillion-parameter language model. At Google’s I/O event, Raghavan called MUM 1,000 times more powerful than BERT based on the number of parameters.
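Those parameter counts come from simple arithmetic over the network’s layers. The toy sketch below—an assumption-laden illustration using fully connected layers, not the actual transformer architecture of BERT or GPT-3—shows how weights and biases add up, using the 768-to-3072-to-768 layer widths found in a BERT-style feed-forward block.

```python
# Illustrative sketch of how neural-network parameter counts add up.
# Real language models are transformers with many kinds of layers; the
# counting principle (weights plus biases per layer) is the same.

def dense_layer_params(n_in, n_out):
    # each output neuron has one weight per input neuron, plus one bias
    return n_in * n_out + n_out

def network_params(layer_sizes):
    # sum parameters over consecutive layer pairs
    return sum(dense_layer_params(a, b)
               for a, b in zip(layer_sizes, layer_sizes[1:]))

# Layer widths matching one BERT-base feed-forward block: 768 -> 3072 -> 768
total = network_params([768, 3072, 768])  # ~4.7 million parameters
```

Stack hundreds of such blocks, plus attention and embedding layers, and the totals quickly reach the hundreds of millions for BERT and hundreds of billions for GPT-3.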
In the “Rethinking Search” paper, the Google researchers call indexing the workhorse of modern search. But they envision doing away with indexing by using ever-larger language models that can understand more queries.
The Knowledge Graph, for example, may serve up answers to factual questions, but it’s trained on only a small portion of the web. Using a language model built from more of the web would allow a search engine to make recommendations, retrieve documents, answer questions, and accomplish a wide range of tasks. The authors of the Rethinking Search paper say the approach has the potential to create a “transformational shift in thinking.”
Such a model doesn’t exist. In fact, the authors say it may require the creation of artificial general intelligence or advances in fields like information retrieval and machine learning. Among other things, they want the new approach to supply authoritative answers from a diversity of perspectives, clearly reveal its sources, and operate without bias.
A Google spokesperson described LaMDA and MUM as part of Google’s research into next-generation language models and said internal pilots are underway for MUM to help people with queries on billions of topics. Asked about the “Rethinking Search” paper and its relationship to LaMDA and MUM, the spokesperson said Google Research does not set directions for Google products and that machine learning that makes its way into Google products like search typically supplements rather than replaces existing products.
Any change in Google’s search algorithms would inevitably affect its core advertising business, which generated $147 billion in revenue last year. Search consultant Michael Blumenthal says the MUM demonstration about hiking boots suggests Google wants to play an even bigger role connecting businesses with consumers. In another change last month, Google introduced a Shopify integration to bring the wares of 1.7 million merchants into search. Food delivery companies DoorDash and Postmates entered search results in 2019.
Blumenthal, who has been consulting for businesses on search strategies for 20 years, notes that Google search results have evolved from a list of links served up by PageRank to include ads, knowledge panels, maps, videos, and augmented reality.
That shift has led to the rise of what some call zero-click searches, instances where people no longer click through to a website to finish a web search. This gives Google the ability to capture ad revenue without users leaving Google to visit the rest of the web. Digital data company Similarweb estimates that users did not click through to another page on nearly two-thirds of Google searches last year; click-through rates are particularly low on mobile devices.
“From where I sit, their ambitions are much bigger than the display ad world,” Blumenthal says about changes to search being considered by Google. “They love connecting parties for transactions, so I see this as just enhancing that immensely.”
Changes that emphasize search with natural language or images could shift users away from a focus on keywords and also disrupt the multi-billion-dollar business of search engine optimization, where businesses vie to appear closer to the top of search results.
Some search-optimization companies have been preparing for the natural language future. Copysmith.ai, a startup based in Birmingham, Alabama, uses GPT-3 to, among other things, generate SEO metatags for websites. CEO Shegun Otulana says the company doesn't see Google's recent moves “as a threat, but as further progress for the whole AI space. It confirms we are moving in the right direction.”
Blumenthal said splashy I/O announcements can take years to fulfill their promise but says it’s increasingly clear that Google wants to be more than a collection of facts and links—and more like an expert capable of answering complex questions. “The only question is when will they get there,” he said.
Google’s approach to large language models as a business strategy and research focus has led to conflict inside the company. Most notably, the two former coleaders of Google’s Ethical AI team, Timnit Gebru and Margaret Mitchell, were forced out after coauthoring a paper highlighting concerns about such models. Among other things, the paper cited research showing that large language models perpetuate bias and stereotypes and can contribute to climate change. The paper says poor data labeling and curation practices become bigger problems as language models get bigger. Crucially, it also points out that the dangers to society wrought by large language models are most likely to fall on marginalized communities.
In January, the author of another recent AI research paper critical of large language models characterized interference by Google’s legal and policy team as “deeply insidious.” In March, researchers from Google’s DeepMind found that large language models can harm society without any malicious intent on the part of the creator through the spread of stereotypes, job loss, and disinformation.
Updated, 6-7-21, 11:50am ET: This article was updated with information from Copysmith.ai.