Linear spatial pyramid matching using sparse coding for image classification

Jianchao Yang, Kai Yu, Yihong Gong, Thomas Huang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Recently, SVMs using the spatial pyramid matching (SPM) kernel have been highly successful in image classification. Despite their popularity, these nonlinear SVMs have a complexity of O(n² ∼ n³) in training and O(n) in testing, where n is the training size, implying that it is nontrivial to scale up the algorithms to handle more than thousands of training images. In this paper we develop an extension of the SPM method by generalizing vector quantization to sparse coding followed by multi-scale spatial max pooling, and propose a linear SPM kernel based on SIFT sparse codes. This new approach remarkably reduces the complexity of SVMs to O(n) in training and a constant in testing. In a number of image categorization experiments, we find that, in terms of classification accuracy, the suggested linear SPM based on sparse coding of SIFT descriptors always significantly outperforms the linear SPM kernel on histograms, and is even better than the nonlinear SPM kernels, leading to state-of-the-art performance on several benchmarks using a single type of descriptor.
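
The pipeline summarized in the abstract (sparse coding of local descriptors, multi-scale spatial max pooling, then a linear kernel) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The ISTA solver, the function names `sparse_codes` and `spm_max_pool`, the parameter `lam`, pooling over absolute values, the 1×1, 2×2, 4×4 grid levels, and the final L2 normalization are all assumptions made for illustration, and the dictionary `D` is taken as already learned.

```python
import numpy as np

def sparse_codes(X, D, lam=0.1, n_iter=100):
    """Approximate argmin_U 0.5*||X - U D||_F^2 + lam*||U||_1 with ISTA.

    X: (N, d) local descriptors (e.g. SIFT), one per row.
    D: (K, d) dictionary with one atom per row (assumed given).
    Returns U: (N, K) sparse codes.
    """
    L = np.linalg.norm(D @ D.T, 2)  # Lipschitz constant of the smooth term
    U = np.zeros((X.shape[0], D.shape[0]))
    for _ in range(n_iter):
        grad = (U @ D - X) @ D.T              # gradient of the squared error
        Z = U - grad / L                      # gradient step
        U = np.sign(Z) * np.maximum(np.abs(Z) - lam / L, 0.0)  # soft threshold
    return U

def spm_max_pool(U, xy, levels=(1, 2, 4)):
    """Max-pool |U| inside each cell of a spatial pyramid and concatenate.

    xy: (N, 2) descriptor positions normalized to [0, 1].
    levels: grid sizes per pyramid level (assumed 1x1, 2x2, 4x4).
    Returns an L2-normalized vector of length K * sum(g*g for g in levels).
    """
    A = np.abs(U)
    K = A.shape[1]
    feats = []
    for g in levels:
        cell = np.minimum((xy * g).astype(int), g - 1)  # cell index per axis
        idx = cell[:, 1] * g + cell[:, 0]               # flat cell index
        pooled = np.zeros((g * g, K))
        np.maximum.at(pooled, idx, A)                   # per-cell maximum
        feats.append(pooled.ravel())
    z = np.concatenate(feats)
    return z / (np.linalg.norm(z) + 1e-12)
```

Under these assumptions, the linear SPM kernel between two images is simply the dot product of their pooled vectors, `z_a @ z_b`, which is why a linear SVM can be trained on the features directly.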

Original language: English (US)
Title of host publication: 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, CVPR Workshops 2009
Publisher: IEEE Computer Society
Pages: 1794-1801
Number of pages: 8
ISBN (Print): 9781424439935
State: Published - 2009
Event: 2009 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2009 - Miami, FL, United States
Duration: Jun 20 2009 → Jun 25 2009

Publication series

Name: 2009 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2009

Other

Other: 2009 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2009
Country/Territory: United States
City: Miami, FL
Period: 6/20/09 → 6/25/09

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
  • Biomedical Engineering
