Tracking persons-of-interest via adaptive discriminative features

Shun Zhang, Yihong Gong, Jia Bin Huang, Jongwoo Lim, Jinjun Wang, Narendra Ahuja, Ming Hsuan Yang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Multi-face tracking in unconstrained videos is a challenging problem as faces of one person often appear drastically different in multiple shots due to significant variations in scale, pose, expression, illumination, and make-up. Low-level features used in existing multi-target tracking methods are not effective for identifying faces with such large appearance variations. In this paper, we tackle this problem by learning discriminative, video-specific face features using convolutional neural networks (CNNs). Unlike existing CNN-based approaches that are only trained on large-scale face image datasets offline, we further adapt the pre-trained face CNN to specific videos using automatically discovered training samples from tracklets. Our network directly optimizes the embedding space so that the Euclidean distances correspond to a measure of semantic face similarity. This is technically realized by minimizing an improved triplet loss function. With the learned discriminative features, we apply the Hungarian algorithm to link tracklets within each shot and the hierarchical clustering algorithm to link tracklets across multiple shots to form final trajectories. We extensively evaluate the proposed algorithm on a set of TV sitcoms and music videos and demonstrate significant performance improvement over existing techniques.
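
As a rough illustration of two ideas named in the abstract, the sketch below shows a standard triplet hinge loss on squared Euclidean distances and a minimum-cost pairing of tracklets by feature distance. It is not the authors' implementation: the abstract does not specify the "improved" triplet loss, so the plain form is used here, and the pairing is solved by brute force over permutations as a stand-in for the Hungarian algorithm, which reaches the same optimum in polynomial time. The function names, the `margin` default, and the toy feature vectors are all assumptions made for this example.

```python
from itertools import permutations
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet hinge loss (not the paper's improved variant).

    The loss is zero once the negative is farther from the anchor than
    the positive by at least `margin` in squared distance. The default
    margin is a hypothetical value chosen for illustration.
    """
    return max(0.0, sq_dist(anchor, positive) - sq_dist(anchor, negative) + margin)


def link_tracklets(ends, starts):
    """Pair earlier tracklets with later ones at minimum total distance.

    `ends[i]` and `starts[j]` are feature vectors for the i-th earlier and
    j-th later tracklet. Brute force over all permutations, so it is only
    practical for a handful of tracklets. Assumes len(ends) == len(starts).
    Returns a list of (earlier_index, later_index) pairs.
    """
    n = len(ends)
    best = min(
        permutations(range(n)),
        key=lambda perm: sum(
            math.sqrt(sq_dist(ends[i], starts[perm[i]])) for i in range(n)
        ),
    )
    return list(enumerate(best))
```

For example, `link_tracklets([[0, 0], [5, 5]], [[5, 4], [0, 1]])` pairs the first earlier tracklet with the second later one and vice versa, since that assignment has the smaller total distance.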

Original language: English (US)
Title of host publication: Computer Vision - 14th European Conference, ECCV 2016, Proceedings
Editors: Bastian Leibe, Jiri Matas, Nicu Sebe, Max Welling
Publisher: Springer
Pages: 415-433
Number of pages: 19
ISBN (Print): 9783319464534
State: Published - 2016
Event: 14th European Conference on Computer Vision, ECCV 2016 - Amsterdam, Netherlands
Duration: Oct 11 2016 to Oct 14 2016

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9909 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 14th European Conference on Computer Vision, ECCV 2016
Country/Territory: Netherlands
City: Amsterdam
Period: 10/11/16 to 10/14/16

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
