Error prediction in spoken dialog: From signal-to-noise ratio to semantic confidence scores

Dilek Hakkani-Tür, Gokhan Tur, Giuseppe Riccardi, Hong Kook Kim

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Spoken dialog systems aim to interpret the meaning of users' utterances and respond accordingly. Each utterance is first recognized by an automatic speech recognizer (ASR), and the user's intent is then extracted by the spoken language understanding (SLU) unit. Both ASR and SLU are noisy, and in general their noise statistics are not correlated. Our goal is to exploit signal-to-noise information together with ASR lattice-based and semantic confidence scores to predict SLU errors, and to prevent them by rejecting erroneous utterances or asking confirmation questions. In our experiments on utterances collected with the AT&T How May I Help You™ Spoken Dialog System, which is used for customer care, we have shown up to an 80% relative decrease in the error rate of the accepted utterances.

Original language: English (US)
Title of host publication: 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP '05 - Proceedings - Image and Multidimensional Signal Processing Multimedia Signal Processing
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: I1041-I1044
ISBN (Print): 0780388747, 9780780388741
State: Published - 2005
Externally published: Yes
Event: 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP '05 - Philadelphia, PA, United States
Duration: Mar 18 2005 to Mar 23 2005

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: I
ISSN (Print): 1520-6149

Other

Other: 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP '05
Country/Territory: United States
City: Philadelphia, PA
Period: 3/18/05 to 3/23/05

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
