Multi-scale orderless pooling of deep convolutional activation features

Yunchao Gong, Liwei Wang, Ruiqi Guo, Svetlana Lazebnik

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Deep convolutional neural networks (CNNs) have shown their promise as a universal representation for recognition. However, global CNN activations lack geometric invariance, which limits their robustness for classification and matching of highly variable scenes. To improve the invariance of CNN activations without degrading their discriminative power, this paper presents a simple but effective scheme called multi-scale orderless pooling (MOP-CNN). This scheme extracts CNN activations for local patches at multiple scale levels, performs orderless VLAD pooling of these activations at each level separately, and concatenates the results. The resulting MOP-CNN representation can be used as a generic feature for either supervised or unsupervised recognition tasks, from image classification to instance-level retrieval; it consistently outperforms global CNN activations without requiring any joint training of prediction layers for a particular target dataset. In absolute terms, it achieves state-of-the-art results on the challenging SUN397 and MIT Indoor Scenes classification datasets, and competitive results on ILSVRC2012/2013 classification and INRIA Holidays retrieval datasets.
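
The pipeline described in the abstract (patches at several scales, VLAD pooling per level, concatenation) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: `toy_describe` is a hypothetical stand-in for a CNN activation extractor, the codebooks are random rather than learned (a real pipeline would typically learn them with k-means), and the patch sizes, stride, and codebook size are illustrative assumptions chosen only so the example runs quickly.

```python
import numpy as np


def extract_patches(image, patch_size, stride):
    """Slide a square window over a 2-D image and stack the patches."""
    h, w = image.shape[:2]
    patches = [
        image[y:y + patch_size, x:x + patch_size]
        for y in range(0, h - patch_size + 1, stride)
        for x in range(0, w - patch_size + 1, stride)
    ]
    return np.stack(patches)


def toy_describe(patch):
    """Hypothetical stand-in for a CNN: average-pool the patch to a 4x4 grid (16-D)."""
    s = patch.shape[0]  # assumes a square patch whose side is divisible by 4
    return patch.reshape(4, s // 4, 4, s // 4).mean(axis=(1, 3)).ravel()


def vlad_pool(descriptors, centers):
    """Orderless VLAD pooling: sum residuals to the nearest center, then L2-normalize."""
    # Squared distance from every descriptor to every center.
    d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)
    vlad = np.zeros_like(centers, dtype=float)
    for i in range(centers.shape[0]):
        members = descriptors[nearest == i]
        if len(members):
            vlad[i] = (members - centers[i]).sum(axis=0)
    vlad = vlad.ravel()
    norm = np.linalg.norm(vlad)
    return vlad / norm if norm > 0 else vlad


def mop_pool(image, describe, codebooks, patch_sizes, stride):
    """Pool each scale level separately with VLAD, then concatenate the levels."""
    levels = []
    for size, centers in zip(patch_sizes, codebooks):
        patches = extract_patches(image, size, stride)
        descriptors = np.stack([describe(p) for p in patches])
        levels.append(vlad_pool(descriptors, centers))
    return np.concatenate(levels)


# Illustrative setup: a random 32x32 "image" and random (not learned) codebooks.
rng = np.random.default_rng(0)
image = rng.random((32, 32))
patch_sizes = [32, 16, 8]  # assumed scale levels, coarse to fine
codebooks = [rng.random((4, 16)) for _ in patch_sizes]  # 4 centers per level
feature = mop_pool(image, toy_describe, codebooks, patch_sizes, stride=8)
print(feature.shape)  # (192,) = 3 levels x 4 centers x 16 dimensions
```

Because VLAD only sums residuals per cluster, shuffling the patch descriptors within a level leaves the pooled vector unchanged, which is the "orderless" property the abstract refers to. The concatenation across levels is what retains coarse scale information.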

Original language: English (US)
Title of host publication: Computer Vision, ECCV 2014 - 13th European Conference, Proceedings
Publisher: Springer
Pages: 392-407
Number of pages: 16
Edition: PART 7
ISBN (Print): 9783319105833
State: Published - 2014
Event: 13th European Conference on Computer Vision, ECCV 2014 - Zurich, Switzerland
Duration: Sep 6 2014 to Sep 12 2014

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 7
Volume: 8695 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 13th European Conference on Computer Vision, ECCV 2014
Country/Territory: Switzerland
City: Zurich
Period: 9/6/14 to 9/12/14

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
