Let's DISCOH: Collecting an annotated open corpus with dialogue acts and reward signals for natural language helpdesks

G. Andreani, G. Di Fabbrizio, M. Gilbert, D. Gillick, D. Hakkani-Tür, O. Lemon

Research output: Chapter in Book/Report/Conference proceeding - Conference contribution

Abstract

We motivate and explain the DISCOH project, which uses a publicly deployed spoken dialogue system for conference services to collect a richly annotated corpus of mixed-initiative human-machine spoken dialogues. System users are able to call a phone number and learn about a conference, including paper submission, program, venue, accommodation options and costs, etc. The collected corpus is (1) usable for training, evaluating and comparing statistical models, (2) naturally spoken and task oriented, (3) extendible / generalizable, (4) collected using state-of-the-art research and commercial technology, and (5) freely available to researchers. We explain the principles behind the dialogue context representations and reward signals collected by the system, as well as the overall system design, Call Types, and Call Flow. We also present results regarding the initial ASR models and spoken language understanding models. We expect the resulting corpora to be used in advanced dialogue research over the coming years.

Original language: English (US)
Title of host publication: 2006 IEEE ACL Spoken Language Technology Workshop, SLT 2006, Proceedings
Pages: 218-221
Number of pages: 4
State: Published - 2006
Externally published: Yes
Event: 2006 IEEE ACL Spoken Language Technology Workshop, SLT 2006 - Palm Beach, Aruba
Duration: Dec 10 2006 - Dec 13 2006

Publication series

Name: 2006 IEEE ACL Spoken Language Technology Workshop, SLT 2006, Proceedings

Conference

Conference: 2006 IEEE ACL Spoken Language Technology Workshop, SLT 2006
Country/Territory: Aruba
City: Palm Beach
Period: 12/10/06 - 12/13/06

Keywords

  • Learning systems
  • Natural language interfaces
  • Speech communication
  • User interfaces

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Linguistics and Language
