Cross-modal adaptation for RGB-D detection

Judy Hoffman, Saurabh Gupta, Jian Leong, Sergio Guadarrama, Trevor Darrell

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

In this paper we propose a technique to adapt convolutional neural network (CNN) based object detectors trained on RGB images so that they can leverage depth images at test time to boost detection performance. Given labeled depth images for a handful of categories, we adapt an RGB object detector for a new category so that it can use depth images in addition to RGB images at test time to produce more accurate detections. Our approach builds on the observation that the lower layers of a CNN are largely task- and category-agnostic but domain-specific, while the higher layers are largely task- and category-specific but domain-agnostic. We operationalize this observation by proposing a mid-level fusion of RGB and depth CNNs. Experimental evaluation on the challenging NYUD2 dataset shows that our proposed adaptation technique yields an average 21% relative improvement in detection performance over an RGB-only baseline, even when no depth training data is available for the particular category evaluated. We believe our proposed technique will extend advances made in computer vision to RGB-D data, leading to improvements in performance at little additional annotation effort.
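The mid-level fusion idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the fusion operator, the layer at which fusion happens, or the network architecture, so the weighted average used here, the single linear + ReLU "lower layer" per modality, and all names (`lower_layers`, `fused_scores`, `alpha`) are assumptions chosen only to show the structure of modality-specific lower layers feeding shared, category-specific higher layers.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def lower_layers(x, weights):
    """Modality-specific lower layers: a stack of linear + ReLU maps (toy stand-in for conv layers)."""
    for w in weights:
        x = relu(x @ w)
    return x

def fused_scores(rgb, depth, rgb_lower, depth_lower, upper, alpha=0.5):
    """Fuse RGB and depth features at a middle layer, then apply shared upper layers.

    ASSUMPTION: fusion is a weighted average of mid-level activations;
    the abstract does not specify the operator.
    """
    mid_rgb = lower_layers(rgb, rgb_lower)        # domain-specific (RGB)
    mid_depth = lower_layers(depth, depth_lower)  # domain-specific (depth)
    mid = alpha * mid_rgb + (1.0 - alpha) * mid_depth
    return mid @ upper                            # category-specific, shared across modalities

# Toy data: 4 candidate regions, 16-d input features, 8-d mid-level features, 3 categories.
rng = np.random.default_rng(0)
n_regions, in_dim, mid_dim, n_classes = 4, 16, 8, 3
rgb_lower = [rng.standard_normal((in_dim, mid_dim))]
depth_lower = [rng.standard_normal((in_dim, mid_dim))]
upper = rng.standard_normal((mid_dim, n_classes))
rgb = rng.standard_normal((n_regions, in_dim))
depth = rng.standard_normal((n_regions, in_dim))

scores = fused_scores(rgb, depth, rgb_lower, depth_lower, upper)
print(scores.shape)
```

In this sketch, setting `alpha=1.0` ignores the depth branch and reduces to an RGB-only detector head, which is the kind of baseline the abstract compares against.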

Original language: English (US)
Title of host publication: 2016 IEEE International Conference on Robotics and Automation, ICRA 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 5032-5039
Number of pages: 8
ISBN (Electronic): 9781467380263
DOIs: https://doi.org/10.1109/ICRA.2016.7487708
State: Published - Jun 8 2016
Externally published: Yes
Event: 2016 IEEE International Conference on Robotics and Automation, ICRA 2016 - Stockholm, Sweden
Duration: May 16 2016 to May 21 2016

Publication series

Name: Proceedings - IEEE International Conference on Robotics and Automation
Volume: 2016-June
ISSN (Print): 1050-4729

Other

Other: 2016 IEEE International Conference on Robotics and Automation, ICRA 2016
Country: Sweden
City: Stockholm
Period: 5/16/16 to 5/21/16

ASJC Scopus subject areas

  • Software
  • Control and Systems Engineering
  • Artificial Intelligence
  • Electrical and Electronic Engineering


  • Cite this

    Hoffman, J., Gupta, S., Leong, J., Guadarrama, S., & Darrell, T. (2016). Cross-modal adaptation for RGB-D detection. In 2016 IEEE International Conference on Robotics and Automation, ICRA 2016 (pp. 5032-5039). [7487708] (Proceedings - IEEE International Conference on Robotics and Automation; Vol. 2016-June). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICRA.2016.7487708