Thinking inside the box: Using appearance models and context based on room geometry

Varsha Hedau, Derek Hoiem, David Forsyth

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In this paper we show that a geometric representation of an object occurring in indoor scenes, along with rich scene structure, can be used to produce a detector for that object in a single image. Using perspective cues from the global scene geometry, we first develop a 3D-based object detector. This detector is competitive with an image-based detector built using state-of-the-art methods; however, combining the two produces a notably improved detector, because it unifies contextual and geometric information. We then use a probabilistic model that explicitly uses constraints imposed by spatial layout (the locations of walls and floor in the image) to refine the 3D object estimates. We use an existing approach to compute spatial layout [1], and use constraints such as that objects are supported by the floor and cannot stick through the walls. The resulting detector (a) has significantly improved accuracy when compared to state-of-the-art 2D detectors and (b) gives a 3D interpretation of the location of the object, derived from a 2D image. We evaluate the detector on beds, for which we give extensive quantitative results derived from images of real scenes.
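The two ideas in the abstract, combining a 3D-based score with an image-based score and using the room layout to rule out implausible boxes, can be illustrated with a small Python sketch. This is not the authors' method: the paper uses a probabilistic model, whereas the sketch below swaps in a fixed weighted sum and hard feasibility checks. The names Box3D, Candidate, combined_score, satisfies_layout, and best_candidate, the weight w_3d, the tolerance floor_tol, and all numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Box3D:
    """Axis-aligned box in room coordinates; z is height above the floor."""
    x_min: float
    x_max: float
    y_min: float
    y_max: float
    z_min: float
    z_max: float


@dataclass(frozen=True)
class Candidate:
    box: Box3D
    score_3d: float  # hypothetical score from a geometry-based detector
    score_2d: float  # hypothetical score from an image-based detector


def combined_score(c: Candidate, w_3d: float = 0.5) -> float:
    """Weighted sum of the two scores (an assumption, not the paper's model)."""
    return w_3d * c.score_3d + (1.0 - w_3d) * c.score_2d


def satisfies_layout(box: Box3D, room: Box3D, floor_tol: float = 0.05) -> bool:
    """Hard versions of the two constraints named in the abstract."""
    # Supported by the floor: the box bottom lies at floor height, within a tolerance.
    on_floor = abs(box.z_min - room.z_min) <= floor_tol
    # Does not stick through the walls: the footprint lies inside the room.
    inside_walls = (
        room.x_min <= box.x_min and box.x_max <= room.x_max
        and room.y_min <= box.y_min and box.y_max <= room.y_max
    )
    return on_floor and inside_walls


def best_candidate(candidates: List[Candidate], room: Box3D) -> Optional[Candidate]:
    """Return the highest-scoring candidate that respects the layout, or None."""
    feasible = [c for c in candidates if satisfies_layout(c.box, room)]
    return max(feasible, key=combined_score, default=None)


# Illustrative data only: a 4 x 5 x 2.5 room and three candidate boxes.
room = Box3D(0.0, 4.0, 0.0, 5.0, 0.0, 2.5)
candidates = [
    Candidate(Box3D(0.5, 2.5, 0.5, 2.0, 0.0, 0.6), 0.7, 0.6),  # on floor, inside walls
    Candidate(Box3D(3.0, 5.0, 0.5, 2.0, 0.0, 0.6), 0.9, 0.9),  # crosses the wall at x = 4
    Candidate(Box3D(0.5, 2.5, 2.5, 4.0, 0.8, 1.4), 0.8, 0.8),  # floats above the floor
]
best = best_candidate(candidates, room)
```

In this toy setup the two higher-scoring candidates are discarded because they violate the layout, so the selected box is the first one. A soft, probabilistic treatment such as the one the abstract describes would down-weight such candidates rather than reject them outright.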

Original language: English (US)
Title of host publication: Computer Vision, ECCV 2010 - 11th European Conference on Computer Vision, Proceedings
Publisher: Springer-Verlag
Pages: 224-237
Number of pages: 14
Edition: PART 6
ISBN (Print): 3642155669, 9783642155666
DOIs: https://doi.org/10.1007/978-3-642-15567-3_17
State: Published - Jan 1 2010
Event: 11th European Conference on Computer Vision, ECCV 2010 - Heraklion, Crete, Greece
Duration: Sep 10 2010 to Sep 11 2010

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 6
Volume: 6316 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 11th European Conference on Computer Vision, ECCV 2010
Country: Greece
City: Heraklion, Crete
Period: 9/10/10 to 9/11/10

Fingerprint

Detector
Detectors
Geometry
Model
Layout
Geometric Representation
Context
Probabilistic Model
Object
Evaluate
Estimate

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Hedau, V., Hoiem, D., & Forsyth, D. (2010). Thinking inside the box: Using appearance models and context based on room geometry. In Computer Vision, ECCV 2010 - 11th European Conference on Computer Vision, Proceedings (PART 6 ed., pp. 224-237). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 6316 LNCS, No. PART 6). Springer-Verlag. https://doi.org/10.1007/978-3-642-15567-3_17

@inproceedings{6537de42104e439d94e898d2744a8cf7,
title = "Thinking inside the box: Using appearance models and context based on room geometry",
abstract = "In this paper we show that a geometric representation of an object occurring in indoor scenes, along with rich scene structure, can be used to produce a detector for that object in a single image. Using perspective cues from the global scene geometry, we first develop a 3D-based object detector. This detector is competitive with an image-based detector built using state-of-the-art methods; however, combining the two produces a notably improved detector, because it unifies contextual and geometric information. We then use a probabilistic model that explicitly uses constraints imposed by spatial layout (the locations of walls and floor in the image) to refine the 3D object estimates. We use an existing approach to compute spatial layout [1], and use constraints such as that objects are supported by the floor and cannot stick through the walls. The resulting detector (a) has significantly improved accuracy when compared to state-of-the-art 2D detectors and (b) gives a 3D interpretation of the location of the object, derived from a 2D image. We evaluate the detector on beds, for which we give extensive quantitative results derived from images of real scenes.",
author = "Varsha Hedau and Derek Hoiem and David Forsyth",
year = "2010",
month = "1",
day = "1",
doi = "10.1007/978-3-642-15567-3_17",
language = "English (US)",
isbn = "3642155669",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer-Verlag",
number = "PART 6",
pages = "224--237",
booktitle = "Computer Vision, ECCV 2010 - 11th European Conference on Computer Vision, Proceedings",
edition = "PART 6",

}

TY - GEN

T1 - Thinking inside the box

T2 - Using appearance models and context based on room geometry

AU - Hedau, Varsha

AU - Hoiem, Derek

AU - Forsyth, David

PY - 2010/1/1

Y1 - 2010/1/1

AB - In this paper we show that a geometric representation of an object occurring in indoor scenes, along with rich scene structure, can be used to produce a detector for that object in a single image. Using perspective cues from the global scene geometry, we first develop a 3D-based object detector. This detector is competitive with an image-based detector built using state-of-the-art methods; however, combining the two produces a notably improved detector, because it unifies contextual and geometric information. We then use a probabilistic model that explicitly uses constraints imposed by spatial layout (the locations of walls and floor in the image) to refine the 3D object estimates. We use an existing approach to compute spatial layout [1], and use constraints such as that objects are supported by the floor and cannot stick through the walls. The resulting detector (a) has significantly improved accuracy when compared to state-of-the-art 2D detectors and (b) gives a 3D interpretation of the location of the object, derived from a 2D image. We evaluate the detector on beds, for which we give extensive quantitative results derived from images of real scenes.

UR - http://www.scopus.com/inward/record.url?scp=78149294425&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=78149294425&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-15567-3_17

DO - 10.1007/978-3-642-15567-3_17

M3 - Conference contribution

AN - SCOPUS:78149294425

SN - 3642155669

SN - 9783642155666

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 224

EP - 237

BT - Computer Vision, ECCV 2010 - 11th European Conference on Computer Vision, Proceedings

PB - Springer-Verlag

ER -