AdaReNet: Adaptive Reweighted Semi-supervised Active Learning to Accelerate Label Acquisition

Ismini Lourentzou, Daniel Gruhl, Alfredo Alba, Anna Lisa Gentile, Petar Ristoski, Chad Deluca, Steven R. Welch, Chengxiang Zhai

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Data scarcity and quality pose significant challenges to supervised learning. The process of generating informative annotations can be time-consuming and often requires high domain expertise. Active and semi-supervised learning methods can reduce labeling effort by either automatically expanding the training set or by selecting the most informative examples to request domain expert annotation. As most selection methods are heuristic, the performance varies widely across datasets and tasks. Bootstrapping approaches such as self-training can result in negative effects due to the addition of incorrectly pseudo-labeled instances. In this work, we take a holistic approach to label acquisition and consider the expansion of clean and pseudo-labeled subsets jointly. To address the challenge of producing high-quality pseudo-labels, we introduce a collaborative teacher-student framework, where the teacher, termed AdaReNet, learns a data-driven curriculum. Experimental results on several natural language processing (NLP) tasks demonstrate that the proposed framework outperforms baselines.

Original language: English (US)
Title of host publication: 14th ACM International Conference on PErvasive Technologies Related to Assistive Environments, PETRA 2021
Publisher: Association for Computing Machinery
Pages: 431-438
Number of pages: 8
ISBN (Electronic): 9781450387927
State: Published - Jun 29 2021
Event: 14th ACM International Conference on PErvasive Technologies Related to Assistive Environments, PETRA 2021 - Virtual, Online, Greece
Duration: Jun 29 2021 → Jul 1 2021

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 14th ACM International Conference on PErvasive Technologies Related to Assistive Environments, PETRA 2021
Country/Territory: Greece
City: Virtual, Online
Period: 6/29/21 → 7/1/21

Keywords

  • Active learning
  • Curriculum Learning
  • Information Extraction
  • Neural Networks
  • Pseudo-labeling
  • Self-training
  • Semi-supervised Learning
  • Sequence Labeling

ASJC Scopus subject areas

  • Software
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Computer Networks and Communications
