Google Books and Scholar users beware: AI-generated nonsense is flooding search results

The book and academic search engines are including titles produced entirely by ChatGPT. Here's how to identify them.
Written by Artie Beaty, Contributing Writer

Do you use Google Books to find books on certain topics? Or Google Scholar to dive into academic research? Here's something you should know: These sites, which enable users to "search the world's most comprehensive index of full-text books" -- and search academic literature across any discipline -- have started indexing low-quality, AI-generated books and papers that appear to be written by real, human authors.

This tip comes courtesy of 404 Media, which used a simple trick to track down AI-generated books. 

If you ask ChatGPT about current events, you'll often be greeted with the phrase, "As of my last knowledge update." That's just OpenAI's way of letting you know that the model's knowledge stops at a cutoff date.

But if you search for that specific phrase -- "As of my last knowledge update" -- on Google Books, you'll run across books that apparently published ChatGPT's output verbatim.

A quick search for that phrase turned up page after page of titles. Some of the books were actually about ChatGPT and included that wording to show its limits, but dozens of others were trying to pass off AI-generated writing as human work.
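If you want to run the same check on text you already have, such as a book preview or an abstract you've copied, the phrase search can be sketched in a few lines of Python. This is only an illustration, not the method 404 Media used: the function name `find_telltale_phrases` and the `context` parameter are made up for this example, and the phrase list contains only the one phrase mentioned above.

```python
import re

# The one phrase reported in the article. Anything you add is your own assumption.
TELLTALE_PHRASES = ["As of my last knowledge update"]


def find_telltale_phrases(text, phrases=TELLTALE_PHRASES, context=40):
    """Return (phrase, snippet) pairs for each case-insensitive match in text.

    Hypothetical helper for illustration only.
    """
    hits = []
    for phrase in phrases:
        # Allow any run of whitespace between words so line breaks don't hide a match.
        pattern = r"\s+".join(re.escape(word) for word in phrase.split())
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            start = max(0, match.start() - context)
            end = min(len(text), match.end() + context)
            hits.append((phrase, text[start:end].strip()))
    return hits


sample = (
    "As of my last knowledge update in September 2021, the case continued "
    "to be subject to legal proceedings."
)
for phrase, snippet in find_telltale_phrases(sample):
    print(f"{phrase!r} found in: {snippet!r}")
```

A match is a reason to look closer, not proof: as noted above, books that are actually about ChatGPT quote the same phrase on purpose.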

For example, one book about the Boston Marathon bombing used the phrase "As of my last knowledge update in September 2021, the case continued to be subject to legal proceedings, and the ultimate outcome was still uncertain" when addressing the attack's perpetrators. The "author" of that book has 50 other works, including titles about the Cold War, 9/11, America's founding fathers, ancient Rome, famous boxers, and famous Native Americans.

Every one of those titles was published in 2023 (ZDNET's own Jack Wallen took 30 years to publish that many books) and ran between 50 and 100 pages. Browsing through them, I found that each offered a superficial narrative that at best resembled a Wikipedia entry and at worst read like ChatGPT merely spitting out facts.

A quick search online revealed these books for sale at Amazon and other retailers. 

When I plugged the same phrase into Google Scholar, which is supposed to be a repository for human research, it returned 19 pages of results, including papers on at-risk youth, diabetes, autism, COVID-19, and airline pilot fatigue.

AI-generated content is nothing new, but when it shows up alongside genuine work in trusted resources like Google Books and Google Scholar, it's a bit worrisome.

Speaking to 404 Media, Google said it would "continue to evaluate our approach as the world of book publishing evolves" but didn't mention removing these results from search. 
