Tell me what you see and I will show you where it is

Jia Xu, Alexander G. Schwing, Raquel Urtasun

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

We tackle the problem of weakly labeled semantic segmentation, where the only source of annotation is a set of image tags encoding which classes are present in the scene. This is an extremely difficult problem, as no pixel-wise labelings are available, not even at training time. In this paper, we show that this problem can be formalized as an instance of learning in a latent structured prediction framework, where the graphical model encodes the presence and absence of a class as well as the assignments of semantic labels to superpixels. As a consequence, we are able to leverage standard algorithms with good theoretical properties. We demonstrate the effectiveness of our approach using the challenging SIFT-flow dataset and show average per-class accuracy improvements of 7% over the state-of-the-art.

Original language: English (US)
Title of host publication: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Publisher: IEEE Computer Society
Pages: 3190-3197
Number of pages: 8
ISBN (Electronic): 9781479951178
DOIs: https://doi.org/10.1109/CVPR.2014.408
State: Published - Sep 24 2014
Externally published: Yes
Event: 27th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2014 - Columbus, United States
Duration: Jun 23 2014 to Jun 28 2014

Publication series

Name: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
ISSN (Print): 1063-6919

Other

Other: 27th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2014
Country: United States
City: Columbus
Period: 6/23/14 to 6/28/14

Keywords

  • Graphical Model
  • Semantic Segmentation
  • Structured Prediction
  • Weakly Supervised Learning

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition


  • Cite this

    Xu, J., Schwing, A. G., & Urtasun, R. (2014). Tell me what you see and I will show you where it is. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (pp. 3190-3197). [6909804] (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition). IEEE Computer Society. https://doi.org/10.1109/CVPR.2014.408