Dialogue learning with human teaching and feedback in end-to-end trainable task-oriented dialogue systems

Bing Liu, Gokhan Tür, Dilek Hakkani-Tür, Pararth Shah, Larry Heck

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In this work, we present a hybrid learning method for training task-oriented dialogue systems through online user interactions. Popular methods for learning task-oriented dialogues include applying reinforcement learning with user feedback on supervised pretraining models. The efficiency of such learning methods may suffer from the mismatch of dialogue state distribution between the offline training and online interactive learning stages. To address this challenge, we propose a hybrid imitation and reinforcement learning method, with which a dialogue agent can effectively learn from its interaction with users by learning from human teaching and feedback. We design a neural network based task-oriented dialogue agent that can be optimized end-to-end with the proposed learning method. Experimental results show that our end-to-end dialogue agent can learn effectively from the mistakes it makes via imitation learning from user teaching. Applying reinforcement learning with user feedback after the imitation learning stage further improves the agent's capability in successfully completing a task.
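The three-stage schedule described in the abstract (supervised pretraining, imitation learning from user teaching, then reinforcement learning from user feedback) can be illustrated with a toy sketch. This is not the authors' neural architecture: it assumes a tabular softmax policy, a hypothetical `TEACHER` table standing in for the human user, a fixed four-state "dialogue", and a binary task-success reward. All names and hyperparameters are illustrative assumptions.

```python
import math
import random

N_STATES, N_ACTIONS = 4, 3
# Hypothetical "correct" action per dialogue state; stands in for the human teacher.
TEACHER = {0: 1, 1: 2, 2: 0, 3: 1}


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def supervised_step(logits, state, action, lr=0.5):
    """One cross-entropy gradient step toward a demonstrated action."""
    probs = softmax(logits[state])
    for a in range(N_ACTIONS):
        target = 1.0 if a == action else 0.0
        logits[state][a] += lr * (target - probs[a])


def reinforce_step(logits, trajectory, reward, lr=0.1):
    """REINFORCE: scale the log-likelihood gradient by the dialogue-level reward."""
    for state, action in trajectory:
        probs = softmax(logits[state])
        for a in range(N_ACTIONS):
            indicator = 1.0 if a == action else 0.0
            logits[state][a] += lr * reward * (indicator - probs[a])


def run_dialogue(logits, rng):
    """Sample one action per state; the toy dialogue visits states in order."""
    trajectory = []
    for state in range(N_STATES):
        probs = softmax(logits[state])
        action = rng.choices(range(N_ACTIONS), weights=probs)[0]
        trajectory.append((state, action))
    return trajectory


def greedy_action(logits, state):
    return max(range(N_ACTIONS), key=lambda a: logits[state][a])


def train(seed=0):
    rng = random.Random(seed)
    logits = [[0.0] * N_ACTIONS for _ in range(N_STATES)]

    # Stage 1: supervised pretraining on an offline corpus that only covers
    # states 0 and 1 (assumption, to mimic the state-distribution mismatch).
    for _ in range(20):
        for state in (0, 1):
            supervised_step(logits, state, TEACHER[state])

    # Stage 2: imitation learning. The agent acts under its own policy and
    # the teacher corrects each mistake in the states the agent actually visits.
    for _ in range(30):
        for state, action in run_dialogue(logits, rng):
            if action != TEACHER[state]:
                supervised_step(logits, state, TEACHER[state])

    # Stage 3: reinforcement learning with binary task-success feedback
    # (reward 1 if every action in the dialogue was correct, else 0).
    for _ in range(200):
        trajectory = run_dialogue(logits, rng)
        success = all(action == TEACHER[state] for state, action in trajectory)
        reinforce_step(logits, trajectory, 1.0 if success else 0.0)

    return logits
```

In this sketch, states 2 and 3 are never seen during pretraining, so the agent only learns them once the teacher corrects its online mistakes; the final stage then sharpens the policy using the success signal alone.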

Original language: English (US)
Title of host publication: Long Papers
Publisher: Association for Computational Linguistics (ACL)
Pages: 2060-2069
Number of pages: 10
ISBN (Electronic): 9781948087278
State: Published - 2018
Externally published: Yes
Event: 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2018 - New Orleans, United States
Duration: Jun 1 2018 → Jun 6 2018

Publication series

Name: NAACL HLT 2018 - 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference
Volume: 1

Conference

Conference: 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2018
Country/Territory: United States
City: New Orleans
Period: 6/1/18 → 6/6/18

ASJC Scopus subject areas

  • Linguistics and Language
  • Language and Linguistics
  • Computer Science Applications
