In the weeks following the release of OpenAI’s viral chatbot ChatGPT late last year, Google AI chief Jeff Dean expressed concern that deploying a conversational search engine too quickly might pose a reputational risk for Alphabet. But last week Google announced its own chatbot, Bard, which in its first demo made a factual error about the James Webb Space Telescope.
Also last week, Microsoft integrated ChatGPT-based technology into Bing search results. Sarah Bird, Microsoft’s head of responsible AI, acknowledged that the bot could still “hallucinate” untrue information but said the technology had been made more reliable. In the days that followed, Bing claimed that running was invented in the 1700s and tried to convince one user that the year is 2022.
Alex Hanna sees a familiar pattern in these events—financial incentives to rapidly commercialize AI outweighing concerns about safety or ethics. There isn’t much money in responsibility or safety, but there’s plenty in overhyping the technology, says Hanna, who previously worked on Google’s Ethical AI team and is now head of research at nonprofit Distributed AI Research.
The race to make large language models—AI systems trained on massive amounts of text from the web—and the movement to make ethics a core part of the AI design process began around the same time. In 2018, Google launched the language model BERT, and before long Meta, Microsoft, and Nvidia released similar projects based on the AI that now helps power Google search results. Also in 2018, Google adopted AI ethics principles that it said would limit future projects. Since then, researchers have warned that large language models carry heightened ethical risks and can spew or even intensify toxic, hateful speech. These models are also predisposed to making things up.
As startups and tech giants have attempted to build competitors to ChatGPT, some in the industry wonder whether the bot has shifted perceptions of when it's acceptable or ethical to deploy AI powerful enough to generate realistic text and images.
OpenAI’s process for releasing models has changed in the past few years. Executives said the text generator GPT-2 was released in stages over months in 2019 out of fear of misuse and concern about its impact on society (a strategy some criticized as a publicity stunt). In 2020, the training process for its more powerful successor, GPT-3, was well documented in public, but less than two months later OpenAI began commercializing the technology through an API for developers. By November 2022, the ChatGPT release process included no technical paper or research publication, only a blog post, a demo, and soon a subscription plan.
Irene Solaiman, policy director at open source AI startup Hugging Face, believes outside pressure can help hold AI systems like ChatGPT to account. She is working with people in academia and industry to create ways for nonexperts to test text and image generators for bias and other problems. If outsiders can probe AI systems, companies will no longer have an excuse to avoid testing for things like skewed outputs or climate impacts, says Solaiman, who previously worked at OpenAI on reducing the toxicity of its systems.