TY - GEN
T1 - Schema-Guided Natural Language Generation
AU - Du, Yuheng
AU - Oraby, Shereen
AU - Perera, Vittorio
AU - Shen, Minmin
AU - Narayan-Chen, Anjali
AU - Chung, Tagyoung
AU - Venkatesh, Anu
AU - Hakkani-Tur, Dilek
N1 - Publisher Copyright:
© 2020 Association for Computational Linguistics
PY - 2020
Y1 - 2020
N2 - Neural network-based approaches to data-to-text natural language generation (NLG) have gained popularity in recent years, with the goal of generating a natural language prompt that accurately realizes an input meaning representation (MR). To facilitate the training of neural network models, researchers created large datasets of paired utterances and their meaning representations. However, the creation of such datasets is an arduous task, and they mostly consist of simple meaning representations composed of slot and value tokens to be realized. These representations do not include any contextual information that an NLG system can use when trying to generalize, such as domain information and descriptions of slots and values. In this paper, we present the novel task of Schema-Guided Natural Language Generation (SG-NLG). Here, the goal is still to generate a natural language prompt, but in SG-NLG, the input MRs are paired with rich schemata providing contextual information. To generate a dataset for SG-NLG, we repurpose an existing dataset for another task, dialog state tracking, which includes a large and rich schema spanning multiple attributes, including information about the domain, user intent, and slot descriptions. We train different state-of-the-art models for neural natural language generation on this dataset and show that, in many cases, including rich schema information allows our models to produce higher-quality outputs in terms of both semantics and diversity. We also conduct experiments comparing model performance on seen versus unseen domains, and we present a human evaluation demonstrating high ratings for overall output quality.
UR - http://www.scopus.com/inward/record.url?scp=85116667734&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85116667734&partnerID=8YFLogxK
U2 - 10.18653/v1/2020.inlg-1.35
DO - 10.18653/v1/2020.inlg-1.35
M3 - Conference contribution
AN - SCOPUS:85116667734
T3 - INLG 2020 - 13th International Conference on Natural Language Generation, Proceedings
SP - 283
EP - 295
BT - INLG 2020 - 13th International Conference on Natural Language Generation, Proceedings
A2 - Davis, Brian
A2 - Graham, Yvette
A2 - Kelleher, John
A2 - Sripada, Yaji
PB - Association for Computational Linguistics (ACL)
T2 - 13th International Conference on Natural Language Generation, INLG 2020
Y2 - 15 December 2020 through 18 December 2020
ER -