TY - GEN
T1 - Self-Training for Compositional Neural NLG in Task-Oriented Dialogue
AU - Li, Xintong
AU - Stevens-Guille, Symon Jory
AU - Maskharashvili, Aleksandre
AU - White, Michael
N1 - We thank the Ohio Supercomputer Center (Center, 1987) for providing sufficient computational resources to train the many large models in our experiments. This research was supported by a collaborative open science research agreement between Facebook and The Ohio State University. The last author is a paid consultant for Facebook.
PY - 2021
Y1 - 2021
N2 - Neural approaches to natural language generation in task-oriented dialogue have typically required large amounts of annotated training data to achieve satisfactory performance, especially when generating from compositional inputs. To address this issue, we show that self-training enhanced with constrained decoding yields large gains in data efficiency on a conversational weather dataset that employs compositional meaning representations. In particular, our experiments indicate that self-training with constrained decoding can enable sequence-to-sequence models to achieve satisfactory quality using vanilla decoding with five to ten times less data than with an ordinary supervised baseline; moreover, by leveraging pretrained models, data efficiency can be further increased to fifty times. We confirm the main automatic results with human evaluations and show that they extend to an enhanced, compositional version of the E2E dataset. The end result is an approach that makes it possible to achieve acceptable performance on compositional NLG tasks using hundreds rather than tens of thousands of training samples.
AB - Neural approaches to natural language generation in task-oriented dialogue have typically required large amounts of annotated training data to achieve satisfactory performance, especially when generating from compositional inputs. To address this issue, we show that self-training enhanced with constrained decoding yields large gains in data efficiency on a conversational weather dataset that employs compositional meaning representations. In particular, our experiments indicate that self-training with constrained decoding can enable sequence-to-sequence models to achieve satisfactory quality using vanilla decoding with five to ten times less data than with an ordinary supervised baseline; moreover, by leveraging pretrained models, data efficiency can be further increased to fifty times. We confirm the main automatic results with human evaluations and show that they extend to an enhanced, compositional version of the E2E dataset. The end result is an approach that makes it possible to achieve acceptable performance on compositional NLG tasks using hundreds rather than tens of thousands of training samples.
UR - https://www.scopus.com/pages/publications/85118015226
UR - https://www.scopus.com/pages/publications/85118015226#tab=citedBy
U2 - 10.18653/v1/2021.inlg-1.10
DO - 10.18653/v1/2021.inlg-1.10
M3 - Conference contribution
AN - SCOPUS:85118015226
T3 - INLG 2021 - 14th International Conference on Natural Language Generation, Proceedings
SP - 87
EP - 102
BT - INLG 2021 - 14th International Conference on Natural Language Generation, Proceedings
A2 - Belz, Anya
A2 - Fan, Angela
A2 - Reiter, Ehud
A2 - Sripada, Yaji
PB - Association for Computational Linguistics (ACL)
T2 - 14th International Conference on Natural Language Generation, INLG 2021
Y2 - 20 September 2021 through 24 September 2021
ER -