TY - GEN
T1 - Towards coherent and engaging spoken dialog response generation using automatic conversation evaluators
AU - Yi, Sanghyun
AU - Goel, Rahul
AU - Khatri, Chandra
AU - Cervone, Alessandra
AU - Chung, Tagyoung
AU - Hedayatnia, Behnam
AU - Venkatesh, Anu
AU - Gabriel, Raefer
AU - Hakkani-Tur, Dilek
N1 - Publisher Copyright:
© 2019 Association for Computational Linguistics
PY - 2019
Y1 - 2019
AB - Encoder-decoder based neural architectures serve as the basis of state-of-the-art approaches in end-to-end open-domain dialog systems. Since most such systems are trained with a maximum likelihood estimation (MLE) objective, they suffer from issues such as lack of generalizability and the generic response problem, i.e., a system response that can be an answer to a large number of user utterances, e.g., “Maybe, I don’t know.” Having explicit feedback on the relevance and interestingness of a system response at each turn can be a useful signal for mitigating such issues and improving system quality by selecting responses from different approaches. Towards this goal, we present a system that evaluates chatbot responses at each dialog turn for coherence and engagement. Our system provides explicit turn-level dialog quality feedback, which we show to be highly correlated with human evaluation. To show that incorporating this feedback into neural response generation models improves dialog quality, we present two different and complementary mechanisms for incorporating explicit feedback into a neural response generation model: reranking and direct modification of the loss function during training. Our studies show that a response generation model that incorporates these combined feedback mechanisms produces more engaging and coherent responses in an open-domain spoken dialog setting, significantly improving response quality under both automatic and human evaluation.
UR - http://www.scopus.com/inward/record.url?scp=85087168694&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85087168694&partnerID=8YFLogxK
DO - 10.18653/v1/W19-8608
M3 - Conference contribution
AN - SCOPUS:85087168694
T3 - INLG 2019 - 12th International Conference on Natural Language Generation, Proceedings of the Conference
SP - 65
EP - 75
BT - INLG 2019 - 12th International Conference on Natural Language Generation, Proceedings of the Conference
PB - Association for Computational Linguistics (ACL)
T2 - 12th International Conference on Natural Language Generation, INLG 2019
Y2 - 29 October 2019 through 1 November 2019
ER -