TY - GEN
T1 - DialGuide: Aligning Dialogue Model Behavior with Developer Guidelines
T2 - Findings of the Association for Computational Linguistics: EMNLP 2023
AU - Gupta, Prakhar
AU - Liu, Yang
AU - Jin, Di
AU - Hedayatnia, Behnam
AU - Gella, Spandana
AU - Liu, Sijia
AU - Lange, Patrick
AU - Hirschberg, Julia
AU - Hakkani-Tur, Dilek
N1 - Publisher Copyright:
© 2023 Association for Computational Linguistics.
PY - 2023
Y1 - 2023
N2 - Dialogue models are able to generate coherent and fluent responses, but they can still be challenging to control and may produce non-engaging, unsafe responses. This unpredictability diminishes user trust and can hinder the use of the models in the real world. To address this, we introduce DIALGUIDE, a novel framework for controlling dialogue model behavior using natural language rules, or guidelines. These guidelines provide information about the context they are applicable to and what should be included in the response, allowing the models to be more closely aligned with the developer's expectations and intent. We evaluate DIALGUIDE on three tasks in open-domain dialogue response generation: guideline selection, response generation, and response entailment verification. Our dataset contains 10,737 positive and 15,467 negative dialogue context-response-guideline triplets across two domains - chitchat and safety. We provide baseline models for the tasks and benchmark their performance. Our results demonstrate that DIALGUIDE is effective in producing safe and engaging responses that follow developer guidelines.
AB - Dialogue models are able to generate coherent and fluent responses, but they can still be challenging to control and may produce non-engaging, unsafe responses. This unpredictability diminishes user trust and can hinder the use of the models in the real world. To address this, we introduce DIALGUIDE, a novel framework for controlling dialogue model behavior using natural language rules, or guidelines. These guidelines provide information about the context they are applicable to and what should be included in the response, allowing the models to be more closely aligned with the developer's expectations and intent. We evaluate DIALGUIDE on three tasks in open-domain dialogue response generation: guideline selection, response generation, and response entailment verification. Our dataset contains 10,737 positive and 15,467 negative dialogue context-response-guideline triplets across two domains - chitchat and safety. We provide baseline models for the tasks and benchmark their performance. Our results demonstrate that DIALGUIDE is effective in producing safe and engaging responses that follow developer guidelines.
UR - http://www.scopus.com/inward/record.url?scp=85183300549&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85183300549&partnerID=8YFLogxK
U2 - 10.18653/v1/2023.findings-emnlp.935
DO - 10.18653/v1/2023.findings-emnlp.935
M3 - Conference contribution
AN - SCOPUS:85183300549
T3 - Findings of the Association for Computational Linguistics: EMNLP 2023
SP - 14031
EP - 14047
BT - Findings of the Association for Computational Linguistics: EMNLP 2023
PB - Association for Computational Linguistics (ACL)
Y2 - 6 December 2023 through 10 December 2023
ER -