Weakly supervised learning of object segmentations from web-scale video

  • Glenn Hartmann
  • Matthias Grundmann
  • Judy Hoffman
  • David Tsai
  • Vivek Kwatra
  • Omid Madani
  • Sudheendra Vijayanarasimhan
  • Irfan Essa
  • James Rehg
  • Rahul Sukthankar

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We propose to learn pixel-level segmentations of objects from weakly labeled (tagged) internet videos. Specifically, given a large collection of raw YouTube content, along with potentially noisy tags, our goal is to automatically generate spatio-temporal masks for each object, such as "dog", without employing any pre-trained object detectors. We formulate this problem as learning weakly supervised classifiers for a set of independent spatio-temporal segments. The object seeds obtained using segment-level classifiers are further refined using graph cuts to generate high-precision object masks. Our results, obtained by training on a dataset of 20,000 YouTube videos weakly tagged into 15 classes, demonstrate automatic extraction of pixel-level object masks. Evaluated against a ground-truthed subset of 50,000 frames with pixel-level annotations, we confirm that our proposed methods can learn good object masks just by watching YouTube.
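
The weak-supervision idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the two-dimensional synthetic features, the plain logistic regression, the top-10 seed selection, and all function names are assumptions made for illustration, and the graph-cut refinement step is omitted. Every segment inherits the noisy tag of its video as a training label, a segment-level classifier is trained on those labels, and the highest-scoring segments in tagged videos are kept as object seeds.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_videos(rng, n_videos=20, segs_per_video=10):
    """Synthetic stand-in for tagged videos (hypothetical data).

    Every second video carries the tag. Inside a tagged video only half
    of the segments actually show the object; the rest are background.
    """
    videos = []
    for v in range(n_videos):
        tagged = v % 2 == 0
        segments = []
        for s in range(segs_per_video):
            is_object = tagged and s % 2 == 0
            center = 2.0 if is_object else 0.0
            feat = [rng.gauss(center, 0.5), rng.gauss(center, 0.5)]
            segments.append((feat, is_object))
        videos.append((tagged, segments))
    return videos


def train_logistic(features, labels, epochs=200, lr=0.5):
    """Fit logistic regression with batch gradient descent."""
    dim = len(features[0])
    w, b, n = [0.0] * dim, 0.0, len(features)
    for _ in range(epochs):
        gw, gb = [0.0] * dim, 0.0
        for x, y in zip(features, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j in range(dim):
                gw[j] += err * x[j]
            gb += err
        w = [wi - lr * g / n for wi, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b


rng = random.Random(0)
videos = make_videos(rng)

# Weak labels: each segment inherits the tag of its video, so background
# segments in tagged videos are (wrongly) labeled positive.
X, y = [], []
for tagged, segments in videos:
    for feat, _ in segments:
        X.append(feat)
        y.append(1.0 if tagged else 0.0)

w, b = train_logistic(X, y)

# Score segments of tagged videos and keep the top 10 as object seeds.
scored = []
for tagged, segments in videos:
    if tagged:
        for feat, is_object in segments:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, feat)) + b)
            scored.append((p, is_object))
scored.sort(key=lambda t: t[0], reverse=True)
seeds = scored[:10]
seed_precision = sum(1 for _, is_obj in seeds if is_obj) / len(seeds)
print("seed precision:", seed_precision)
```

Because the untagged videos supply clean negatives for background appearance, the classifier learns to rank true object segments above the mislabeled background segments even though it never sees segment-level ground truth. In a full pipeline, seeds like these would be handed to a separate refinement stage.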

Original language: English (US)
Title of host publication: Computer Vision, ECCV 2012 - Workshops and Demonstrations, Proceedings
Publisher: Springer
Pages: 198-208
Number of pages: 11
Edition: PART 1
ISBN (Print): 9783642338625
State: Published - 2012
Externally published: Yes
Event: Computer Vision, ECCV 2012 - Workshops and Demonstrations, Proceedings - Florence, Italy
Duration: Oct 7 2012 to Oct 13 2012

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 1
Volume: 7583 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: Computer Vision, ECCV 2012 - Workshops and Demonstrations, Proceedings
Country/Territory: Italy
City: Florence
Period: 10/7/12 to 10/13/12

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
