Selective In-Context Data Augmentation for Intent Detection using Pointwise V-Information

Yen Ting Lin, Alexandros Papangelis, Seokhwan Kim, Sungjin Lee, Devamanyu Hazarika, Mahdi Namazifar, Di Jin, Yang Liu, Dilek Hakkani-Tur

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

This work focuses on in-context data augmentation for intent detection. Having found that augmentation via in-context prompting of large pretrained language models (PLMs) alone does not improve performance, we introduce a novel approach based on PLMs and pointwise V-information (PVI), a metric that can measure the usefulness of a datapoint for training a model. Our method first fine-tunes a PLM on a small seed of training data and then synthesizes new datapoints - utterances that correspond to given intents. It then employs intent-aware filtering, based on PVI, to remove datapoints that are not helpful to the downstream intent classifier. Our method is thus able to leverage the expressive power of large language models to produce diverse training data. Empirical results demonstrate that our method can produce synthetic training data that achieve state-of-the-art performance on three challenging intent detection datasets under few-shot settings (1.28% absolute improvement in 5-shot and 1.18% absolute in 10-shot, on average) and perform on par with the state-of-the-art in full-shot settings (within 0.01% absolute, on average).

Original languageEnglish (US)
Title of host publicationEACL 2023 - 17th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference
PublisherAssociation for Computational Linguistics (ACL)
Pages1455-1468
Number of pages14
ISBN (Electronic)9781959429449
StatePublished - 2023
Externally publishedYes
Event17th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2023 - Dubrovnik, Croatia
Duration: May 2 2023May 6 2023

Publication series

NameEACL 2023 - 17th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference

Conference

Conference17th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2023
Country/TerritoryCroatia
CityDubrovnik
Period5/2/235/6/23

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Software
  • Linguistics and Language

Fingerprint

Dive into the research topics of 'Selective In-Context Data Augmentation for Intent Detection using Pointwise V-Information'. Together they form a unique fingerprint.

Cite this