First-person action decomposition and zero-shot learning

Yun C. Zhang, Yin Li, James M. Rehg

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

In this work, we decompose a first-person action into verb and noun. We then study how the coupling of an action's constituent verb and noun affects the learners' ability to learn them separately and to combine them to perform recognition. We compare different information fusion methods on conventional action recognition and zero-shot learning, of which the latter is a strong indication of the feature's ability to capture one concept (verb/noun) and not be confounded by the other. To achieve the decoupling of verb/noun concepts, we extract features that are specialized for each of them. Specifically, we use improved dense trajectories and convolutional neural network activations. We show that by constructing specialized features for the decomposed concepts, our method succeeds in zero-shot learning. More surprisingly, it also outperforms previous results in conventional action recognition when the performance gaps of different features on verb/noun concepts are significant.
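The idea of recognizing an action by combining separately learned verb and noun concepts can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify which fusion rule performs best, so the product-of-probabilities fusion below, the `fuse_action_scores` helper, and all labels and score values are hypothetical, chosen only to show how an unseen verb/noun pair could be scored from two independent classifiers.

```python
import numpy as np


def softmax(x):
    """Turn raw classifier scores into a probability distribution."""
    x = np.asarray(x, dtype=float)
    e = np.exp(x - x.max())  # subtract the max for numerical stability
    return e / e.sum()


def fuse_action_scores(verb_scores, noun_scores, verbs, nouns, candidate_actions):
    """Score (verb, noun) actions by multiplying per-concept probabilities.

    Assumption: verb and noun are treated as independent, so an action that
    never appeared in training can still be scored as long as its verb and
    its noun were each seen separately.
    """
    p_verb = softmax(verb_scores)
    p_noun = softmax(noun_scores)
    verb_index = {v: i for i, v in enumerate(verbs)}
    noun_index = {n: i for i, n in enumerate(nouns)}
    return {
        (v, n): float(p_verb[verb_index[v]] * p_noun[noun_index[n]])
        for v, n in candidate_actions
    }


# Hypothetical label sets and raw scores for a single video clip.
verbs = ["take", "open", "pour"]
nouns = ["cup", "jar", "milk"]
verb_scores = [0.2, 2.5, 0.1]  # e.g. from a motion-oriented feature
noun_scores = [0.3, 0.1, 3.0]  # e.g. from an appearance-oriented feature

# Candidate actions assumed to be absent from the training set.
unseen_actions = [("open", "milk"), ("take", "jar")]

scores = fuse_action_scores(verb_scores, noun_scores, verbs, nouns, unseen_actions)
best_action = max(scores, key=scores.get)
print(best_action)
```

In this sketch the verb scores stand in for a motion feature and the noun scores for an appearance feature, mirroring the specialized features mentioned in the abstract; any other fusion rule could be substituted inside `fuse_action_scores`.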

Original language: English (US)
Title of host publication: Proceedings - 2017 IEEE Winter Conference on Applications of Computer Vision, WACV 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 121-129
Number of pages: 9
ISBN (Electronic): 9781509048229
State: Published - May 11 2017
Externally published: Yes
Event: 17th IEEE Winter Conference on Applications of Computer Vision, WACV 2017 - Santa Rosa, United States
Duration: Mar 24 2017 → Mar 31 2017

ASJC Scopus subject areas

  • Computer Science Applications
  • Computer Vision and Pattern Recognition
