Unsupervised and active learning in automatic speech recognition for call classification

Dilek Hakkani-Tür, Gokhan Tur, Mazin Rahim, Giuseppe Riccardi

Research output: Contribution to journal › Conference article › peer-review

Abstract

A key challenge in rapidly building spoken natural language dialog applications is minimizing the manual effort required in transcribing and labeling speech data. This task is not only expensive but also time consuming. In this paper, we present a novel approach that aims at reducing the amount of manually transcribed in-domain data required for building automatic speech recognition (ASR) models in spoken language dialog systems. Our method is based on mining relevant text from various conversational systems and web sites. An iterative process is employed where the performance of the models can be improved through both unsupervised and active learning of the ASR models. We have evaluated the robustness of our approach on a call classification task that has been selected from AT&T VoiceTone℠ customer care. Our results indicate that with unsupervised learning it is possible to achieve a call classification performance that is only 1.5% lower than the upper bound set when using all available in-domain transcribed data.
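As an illustration of the kind of iterative loop the abstract alludes to, the sketch below shows a hypothetical confidence-based selection step in Python: utterances whose ASR hypotheses have the lowest confidence are routed to manual transcription (active learning), while the remainder are kept with their automatic transcripts (unsupervised learning) for retraining. All class names, fields, and thresholds are illustrative assumptions, not the paper's actual implementation.

from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Utterance:
    audio_id: str
    hypothesis: str       # current ASR 1-best output
    confidence: float     # ASR confidence score in [0, 1]


def split_for_labeling(
    utterances: List[Utterance],
    budget: int,
) -> Tuple[List[Utterance], List[Utterance]]:
    """Select the `budget` least-confident utterances for manual transcription;
    the rest are retained with their automatic transcripts (unsupervised)."""
    ranked = sorted(utterances, key=lambda u: u.confidence)
    return ranked[:budget], ranked[budget:]


if __name__ == "__main__":
    batch = [
        Utterance("utt1", "i want to check my bill", 0.92),
        Utterance("utt2", "cancel my um service please", 0.55),
        Utterance("utt3", "talk to an agent", 0.78),
    ]
    to_transcribe, auto_labeled = split_for_labeling(batch, budget=1)
    print("send to human transcription:", [u.audio_id for u in to_transcribe])
    print("use ASR output directly:", [u.audio_id for u in auto_labeled])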

Original language: English (US)
Pages (from-to): I429-I432
Journal: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 1
State: Published - 2004
Externally published: Yes
Event: Proceedings - IEEE International Conference on Acoustics, Speech, and Signal Processing - Montreal, Que., Canada
Duration: May 17, 2004 to May 21, 2004

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
