TY - JOUR
T1 - Recovering surface layout from an image
AU - Hoiem, Derek
AU - Efros, Alexei A.
AU - Hebert, Martial
N1 - Funding Information:
We would like to thank Rahul Sukthankar and Takeo Kanade for valuable discussions and references. We thank Jake Sprouse for providing images for the navigation application, Heegun Lee for annotating the indoor images, and our reviewers for helping to improve the clarity of the paper. This work is partially funded by NSF CAREER award IIS-0546547 and a Microsoft Research Fellowship to DH.
PY - 2007/10
Y1 - 2007/10
AB - Humans have an amazing ability to instantly grasp the overall 3D structure of a scene (ground orientation, relative positions of major landmarks, etc.) even from a single image. This ability is completely missing in most popular recognition algorithms, which pretend that the world is flat and/or view it through a patch-sized peephole. Yet it seems very likely that having a grasp of this "surface layout" of a scene should be of great assistance for many tasks, including recognition, navigation, and novel view synthesis. In this paper, we take the first step towards constructing the surface layout, a labeling of the image into geometric classes. Our main insight is to learn appearance-based models of these geometric classes, which coarsely describe the 3D scene orientation of each image region. Our multiple segmentation framework provides robust spatial support, allowing a wide variety of cues (e.g., color, texture, and perspective) to contribute to the confidence in each geometric label. In experiments on a large set of outdoor images, we evaluate the impact of the individual cues and design choices in our algorithm. We further demonstrate the applicability of our method to indoor images, describe potential applications, and discuss extensions to a more complete notion of surface layout.
KW - Context
KW - Geometric context
KW - Image understanding
KW - Model-driven segmentation
KW - Multiple segmentations
KW - Object detection
KW - Object recognition
KW - Scene understanding
KW - Spatial layout
KW - Surface layout
UR - http://www.scopus.com/inward/record.url?scp=34547216923&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=34547216923&partnerID=8YFLogxK
U2 - 10.1007/s11263-006-0031-y
DO - 10.1007/s11263-006-0031-y
M3 - Article
AN - SCOPUS:34547216923
SN - 0920-5691
VL - 75
SP - 151
EP - 172
JO - International Journal of Computer Vision
JF - International Journal of Computer Vision
IS - 1
ER -