Embodied one-shot video recognition: Learning from actions of a virtual embodied agent

Yuqian Fu, Chengrong Wang, Yanwei Fu, Yu Xiong Wang, Cong Bai, Xiangyang Xue, Yu Gang Jiang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

One-shot learning aims to recognize novel target classes from few examples by transferring knowledge from source classes, under a general assumption that the source and target classes are semantically related but not exactly the same. Based on this assumption, recent work has focused on image-based one-shot learning, while little work has addressed video-based one-shot learning. One challenge is that the disjoint-class assumption is difficult to maintain for videos, since video clips of target classes may appear in the videos of source classes. To address this issue, we introduce a novel setting, termed embodied agents based one-shot learning, which leverages synthetic videos produced in a virtual environment to understand realistic videos of target classes. In this setting, we further propose two types of learning tasks: embodied one-shot video domain adaptation and embodied one-shot video transfer recognition. These tasks serve as a testbed for evaluating video-related one-shot learning tasks. In addition, we propose a general video segment augmentation method, which significantly facilitates a variety of one-shot learning tasks. Experimental results validate the soundness of our setting and learning tasks, and also show the effectiveness of our augmentation approach for video recognition in the small-sample-size regime.

Original language: English (US)
Title of host publication: MM 2019 - Proceedings of the 27th ACM International Conference on Multimedia
Publisher: Association for Computing Machinery
Pages: 411-419
Number of pages: 9
ISBN (Electronic): 9781450368896
State: Published - Oct 15 2019
Externally published: Yes
Event: 27th ACM International Conference on Multimedia, MM 2019 - Nice, France
Duration: Oct 21 2019 → Oct 25 2019

Publication series

Name: MM 2019 - Proceedings of the 27th ACM International Conference on Multimedia

Conference

Conference: 27th ACM International Conference on Multimedia, MM 2019
Country/Territory: France
City: Nice
Period: 10/21/19 → 10/25/19

Keywords

  • Embodied Agents
  • One-shot Learning
  • Video Action Recognition

ASJC Scopus subject areas

  • General Computer Science
  • Media Technology
