Recovering surface layout from an image

Derek Hoiem, Alexei A. Efros, Martial Hebert

Research output: Contribution to journal › Article › peer-review

Abstract

Humans have an amazing ability to instantly grasp the overall 3D structure of a scene (ground orientation, relative positions of major landmarks, etc.) even from a single image. This ability is completely missing in most popular recognition algorithms, which pretend that the world is flat and/or view it through a patch-sized peephole. Yet it seems very likely that having a grasp of this "surface layout" of a scene should be of great assistance for many tasks, including recognition, navigation, and novel view synthesis. In this paper, we take the first step towards constructing the surface layout, a labeling of the image into geometric classes. Our main insight is to learn appearance-based models of these geometric classes, which coarsely describe the 3D scene orientation of each image region. Our multiple segmentation framework provides robust spatial support, allowing a wide variety of cues (e.g., color, texture, and perspective) to contribute to the confidence in each geometric label. In experiments on a large set of outdoor images, we evaluate the impact of the individual cues and design choices in our algorithm. We further demonstrate the applicability of our method to indoor images, describe potential applications, and discuss extensions to a more complete notion of surface layout.
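
As a rough illustration of how several segmentations might jointly contribute to the confidence in each geometric label, the following minimal sketch combines per-segment label confidences into per-pixel confidences. It is not taken from the abstract: the label names ("ground", "vertical", "sky"), the per-segment "homogeneity" weight, the weighted-sum rule, and the function name `pixel_label_confidences` are all assumptions made for illustration, and the per-segment confidences are supplied by hand rather than learned from color, texture, or perspective cues.

```python
# Hypothetical sketch: the label set, the "homogeneity" weight, and the
# weighted-sum combination rule are illustrative assumptions, not details
# stated in the abstract.

def pixel_label_confidences(segmentations, num_pixels, labels):
    """Combine per-segment label confidences from several segmentations.

    segmentations: list of segmentations; each is a list of segments.
    Each segment is a dict with:
      "pixels": iterable of pixel indices covered by the segment,
      "homogeneity": assumed weight in [0, 1] that the segment has one label,
      "label_conf": dict mapping label -> confidence for the segment.
    Returns one dict per pixel mapping label -> normalized confidence.
    """
    totals = [{label: 0.0 for label in labels} for _ in range(num_pixels)]
    for segments in segmentations:
        for seg in segments:
            for p in seg["pixels"]:
                for label in labels:
                    totals[p][label] += seg["homogeneity"] * seg["label_conf"][label]

    result = []
    for scores in totals:
        norm = sum(scores.values())
        if norm > 0:
            result.append({label: s / norm for label, s in scores.items()})
        else:
            # No segment covered this pixel: fall back to a uniform guess.
            result.append({label: 1.0 / len(labels) for label in labels})
    return result


# Toy input: a 4-pixel "image" segmented two ways (values are made up).
LABELS = ("ground", "vertical", "sky")
coarse = [
    {"pixels": [0, 1, 2, 3], "homogeneity": 0.5,
     "label_conf": {"ground": 0.5, "vertical": 0.3, "sky": 0.2}},
]
fine = [
    {"pixels": [0, 1], "homogeneity": 1.0,
     "label_conf": {"ground": 0.8, "vertical": 0.1, "sky": 0.1}},
    {"pixels": [2, 3], "homogeneity": 1.0,
     "label_conf": {"ground": 0.1, "vertical": 0.2, "sky": 0.7}},
]
conf = pixel_label_confidences([coarse, fine], num_pixels=4, labels=LABELS)
best = [max(c, key=c.get) for c in conf]  # most confident label per pixel
```

In this toy setup, a pixel's final confidence leans toward the segments assumed to be more homogeneous, so one poor segmentation does not dictate the label on its own.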

Original language: English (US)
Pages (from-to): 151-172
Number of pages: 22
Journal: International Journal of Computer Vision
Volume: 75
Issue number: 1
State: Published - Oct 2007
Externally published: Yes

Keywords

  • Context
  • Geometric context
  • Image understanding
  • Model-driven segmentation
  • Multiple segmentations
  • Object detection
  • Object recognition
  • Scene understanding
  • Spatial layout
  • Surface layout

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition
  • Artificial Intelligence
