ScienceDaily
Your source for the latest research news

New test reveals AI still lacks common sense

Date: November 18, 2020
Source: University of Southern California
Summary: Natural language processing (NLP) has taken great strides recently -- but how much does AI understand of what it reads? Less than we thought, it seems. Despite advances, AI still doesn't have the common sense needed to generate plausible sentences.
FULL STORY

Natural language processing (NLP) has taken great strides recently -- but how much does AI understand of what it reads? Less than we thought, according to researchers at USC's Department of Computer Science. In a recent paper, Assistant Professor Xiang Ren and PhD student Yuchen Lin found that, despite advances, AI still doesn't have the common sense needed to generate plausible sentences.

"Current machine text-generation models can write an article that may be convincing to many humans, but they're basically mimicking what they have seen in the training phase," said Lin. "Our goal in this paper is to study the problem of whether current state-of-the-art text-generation models can write sentences to describe natural scenarios in our everyday lives."

Understanding scenarios in daily life

Specifically, Ren and Lin tested the models' ability to reason and showed there is a large gap between current text generation models and human performance. Given a set of common nouns and verbs, state-of-the-art NLP computer models were tasked with creating believable sentences describing an everyday scenario. While the models generated grammatically correct sentences, they were often logically incoherent.

For instance, here's one example sentence generated by a state-of-the-art model using the words "dog, frisbee, throw, catch":

"Two dogs are throwing frisbees at each other."

The test is based on the assumption that coherent ideas (in this case, "a person throws a frisbee and a dog catches it") can't be generated without a deeper awareness of common-sense concepts. In other words, common sense is more than just the correct understanding of language -- it means you don't have to explain everything in a conversation. This is a fundamental challenge in the goal of developing generalizable AI -- but beyond academia, it's relevant for consumers, too.
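
The constraint in the task can be illustrated with a rough sketch. The Python below is not the researchers' evaluation code; it is a hypothetical surface check (the function name `missing_concepts` and the prefix-matching rule are assumptions made for illustration) that reports which of the given words a sentence fails to use.

```python
import re

def missing_concepts(concepts, sentence):
    """Return the concepts that no word in the sentence starts with.

    Prefix matching is a crude stand-in for lemmatization (an assumption
    made for this illustration): "throwing" counts as a use of "throw".
    """
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return [c for c in concepts
            if not any(t.startswith(c.lower()) for t in tokens)]

concepts = ["dog", "frisbee", "throw", "catch"]
model_output = "Two dogs are throwing frisbees at each other."
human_like = "A person throws a frisbee and a dog catches it."

print(missing_concepts(concepts, model_output))  # ['catch']
print(missing_concepts(concepts, human_like))    # []
```

On the example above, this check can flag that the model's sentence never uses "catch," but it has no way of noticing that dogs throwing frisbees is implausible. A word-level check of this kind measures vocabulary, not common sense.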

Without an understanding of language, chatbots and voice assistants built on these state-of-the-art natural-language models are vulnerable to failure. It's also crucial if robots are to become more present in human environments. After all, if you ask a robot for hot milk, you expect it to know you want a cup of milk, not the whole carton.

"We also show that if a generation model performs better on our test, it can also benefit other applications that need commonsense reasoning, such as robotic learning," said Lin. "Robots need to understand natural scenarios in our daily life before they make reasonable actions to interact with people."

Joining Lin and Ren on the paper are USC's Wangchunshu Zhou, Ming Shen, and Pei Zhou; Chandra Bhagavatula from the Allen Institute for Artificial Intelligence; and Yejin Choi from the Allen Institute for Artificial Intelligence and the Paul G. Allen School of Computer Science & Engineering, University of Washington.

The common sense test

Common-sense reasoning, or the ability to make inferences using basic knowledge about the world -- like the fact that dogs cannot throw frisbees to each other -- has resisted AI researchers' efforts for decades. State-of-the-art deep-learning models can now reach around 90% accuracy, so it would seem that NLP has gotten closer to its goal.

But Ren, an expert in natural language processing, and Lin, his student, needed more convincing about this statistic's accuracy. In their paper, published in the Findings of Empirical Methods in Natural Language Processing (EMNLP) conference on Nov. 16, they challenge the effectiveness of the benchmark and, therefore, the level of progress the field has actually made.

"Humans acquire the ability to compose sentences by learning to understand and use common concepts that they recognize in their surrounding environment," said Lin.

"Acquiring this ability is regarded as a major milestone in human development. But we wanted to test if machines can really acquire such generative commonsense reasoning ability."

To evaluate different machine models, the pair developed a constrained text generation task called CommonGen, which can be used as a benchmark to test the generative common sense of machines. The researchers presented a dataset consisting of 35,141 concepts associated with 77,449 sentences. They found that even the best-performing model achieved an accuracy rate of only 31.6%, versus 63.5% for humans.
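
To put those two figures side by side, the short calculation below uses only the numbers quoted above; the variable names are ours, chosen for illustration.

```python
model_score = 31.6   # best-performing model, percent (from the article)
human_score = 63.5   # human performance, percent (from the article)

gap_points = human_score - model_score   # absolute gap in percentage points
relative = model_score / human_score     # model score as a fraction of the human score

print(f"gap: {gap_points:.1f} percentage points")           # gap: 31.9 percentage points
print(f"model reaches {relative:.1%} of the human score")   # model reaches 49.8% of the human score
```

In other words, the best model scored roughly half of what humans did on the same task.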

"We were surprised that the models cannot recall the simple commonsense knowledge that 'a human throwing a frisbee' should be much more reasonable than a dog doing it," said Lin. "We find even the strongest model, called the T5, after training with a large dataset, can still make silly mistakes."

It seems, said the researchers, that previous tests have not sufficiently challenged the models on their common-sense abilities; instead, the models mimic what they have seen in the training phase.

"Previous studies have primarily focused on discriminative common sense," said Ren. "They test machines with multi-choice questions, where the search space for the machine is small -- usually four or five candidates."

For instance, a typical setting for discriminative common-sense testing is a multiple-choice question answering task, for example: "Where do adults use glue sticks?" A: classroom B: office C: desk drawer.

The answer here, of course, is "B: office." Even computers can figure this out without much trouble. In contrast, a generative setting is more open-ended, such as the CommonGen task, where a model is asked to generate a natural sentence from given concepts.
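
The difference in search space can be made concrete with a small sketch. Everything in it is hypothetical: the `MultipleChoiceItem` class, the `chance_accuracy` and `pick` helpers, and the hand-written toy scores are illustrative names and values, not part of CommonGen or any benchmark's code. It shows that in a multiple-choice setting a system only has to rank a handful of candidates, so even blind guessing earns a nontrivial score.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class MultipleChoiceItem:
    question: str
    candidates: List[str]
    answer: str  # the candidate treated as correct

def chance_accuracy(item: MultipleChoiceItem) -> float:
    """Expected accuracy of uniform random guessing on one item."""
    return 1.0 / len(item.candidates)

def pick(item: MultipleChoiceItem, score: Callable[[str, str], float]) -> str:
    """Discriminative setting: return the highest-scoring candidate."""
    return max(item.candidates, key=lambda c: score(item.question, c))

item = MultipleChoiceItem(
    question="Where do adults use glue sticks?",
    candidates=["classroom", "office", "desk drawer"],
    answer="office",
)

# Toy scorer (an assumption for illustration only): a hand-written lookup
# standing in for whatever a trained model would compute.
toy_scores = {"classroom": 0.3, "office": 0.6, "desk drawer": 0.1}
choice = pick(item, lambda question, candidate: toy_scores[candidate])

print(choice == item.answer)             # True
print(round(chance_accuracy(item), 2))   # 0.33
```

With three candidates, random guessing is right about a third of the time; with the four or five candidates Ren mentions, it is right 20 to 25 percent of the time. A generative task offers no candidate list to guess from.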

Ren explains: "With extensive model training, it is very easy to have a good performance on those tasks. Unlike those discriminative commonsense reasoning tasks, our proposed test focuses on the generative aspect of machine common sense."

Ren and Lin hope the dataset will serve as a new benchmark to benefit future research on introducing common sense to natural language generation. In fact, they maintain a leaderboard showing the scores achieved by various popular models, to help other researchers determine the models' viability for future projects.

"By introducing common sense and other domain-specific knowledge to machines, I believe that one day we can see AI agents such as Samantha in the movie Her that generate natural responses and interact with our lives," said Lin.

Story Source:

Materials provided by University of Southern California. Original written by Caitlin Dawson. Note: Content may be edited for style and length.


Journal Reference:

  1. Bill Yuchen Lin, Wangchunshu Zhou, Ming Shen, Pei Zhou, Chandra Bhagavatula, Yejin Choi, Xiang Ren. CommonGen: A Constrained Text Generation Challenge for Generative Commonsense Reasoning. submitted to arXiv, 2020

Cite This Page:

  • MLA: University of Southern California. "New test reveals AI still lacks common sense." ScienceDaily. ScienceDaily, 18 November 2020. <www.sciencedaily.com/releases/2020/11/201118141702.htm>.
  • APA: University of Southern California. (2020, November 18). New test reveals AI still lacks common sense. ScienceDaily. Retrieved November 19, 2020 from www.sciencedaily.com/releases/2020/11/201118141702.htm
  • Chicago: University of Southern California. "New test reveals AI still lacks common sense." ScienceDaily. www.sciencedaily.com/releases/2020/11/201118141702.htm (accessed November 19, 2020).
