TY - GEN
T1 - Linear spatial pyramid matching using sparse coding for image classification
AU - Yang, Jianchao
AU - Yu, Kai
AU - Gong, Yihong
AU - Huang, Thomas
PY - 2009
Y1 - 2009
N2 - Recently SVMs using spatial pyramid matching (SPM) kernel have been highly successful in image classification. Despite its popularity, these nonlinear SVMs have a complexity O(n² ∼ n³) in training and O(n) in testing, where n is the training size, implying that it is nontrivial to scale up the algorithms to handle more than thousands of training images. In this paper we develop an extension of the SPM method, by generalizing vector quantization to sparse coding followed by multi-scale spatial max pooling, and propose a linear SPM kernel based on SIFT sparse codes. This new approach remarkably reduces the complexity of SVMs to O(n) in training and a constant in testing. In a number of image categorization experiments, we find that, in terms of classification accuracy, the suggested linear SPM based on sparse coding of SIFT descriptors always significantly outperforms the linear SPM kernel on histograms, and is even better than the nonlinear SPM kernels, leading to state-of-the-art performance on several benchmarks by using a single type of descriptors.
AB - Recently SVMs using spatial pyramid matching (SPM) kernel have been highly successful in image classification. Despite its popularity, these nonlinear SVMs have a complexity O(n² ∼ n³) in training and O(n) in testing, where n is the training size, implying that it is nontrivial to scale up the algorithms to handle more than thousands of training images. In this paper we develop an extension of the SPM method, by generalizing vector quantization to sparse coding followed by multi-scale spatial max pooling, and propose a linear SPM kernel based on SIFT sparse codes. This new approach remarkably reduces the complexity of SVMs to O(n) in training and a constant in testing. In a number of image categorization experiments, we find that, in terms of classification accuracy, the suggested linear SPM based on sparse coding of SIFT descriptors always significantly outperforms the linear SPM kernel on histograms, and is even better than the nonlinear SPM kernels, leading to state-of-the-art performance on several benchmarks by using a single type of descriptors.
UR - http://www.scopus.com/inward/record.url?scp=70450209196&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70450209196&partnerID=8YFLogxK
U2 - 10.1109/CVPRW.2009.5206757
DO - 10.1109/CVPRW.2009.5206757
M3 - Conference contribution
AN - SCOPUS:70450209196
SN - 9781424439935
T3 - 2009 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2009
SP - 1794
EP - 1801
BT - 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, CVPR Workshops 2009
PB - IEEE Computer Society
T2 - 2009 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2009
Y2 - 20 June 2009 through 25 June 2009
ER -