Beyond the Euclidean distance: Creating effective visual codebooks using the histogram intersection kernel

Jianxin Wu, James M. Rehg

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Common visual codebook generation methods used in a bag-of-visual-words model, e.g., k-means or Gaussian Mixture Model, use the Euclidean distance to cluster features into visual code words. However, most popular visual descriptors are histograms of image measurements. It has been shown that the Histogram Intersection Kernel (HIK) is more effective than the Euclidean distance in supervised learning tasks with histogram features. In this paper, we demonstrate that HIK can also be used in an unsupervised manner to significantly improve the generation of visual codebooks. We propose a histogram kernel k-means algorithm which is easy to implement and runs almost as fast as k-means. The HIK codebook consistently achieves 2-4% higher recognition accuracy than k-means codebooks. In addition, we propose a one-class SVM formulation to create more effective visual code words which can achieve even higher accuracy. The proposed method has established new state-of-the-art performance numbers for 3 popular benchmark datasets on object and scene recognition. We also show that the standard k-median clustering method can be used for visual codebook generation and can act as a compromise between the HIK and k-means approaches.
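
To make the clustering idea in the abstract concrete, the following is a minimal sketch of generic kernel k-means driven by the histogram intersection kernel, K(x, y) = Σ_i min(x_i, y_i). It is not the authors' algorithm: it is the naive version that precomputes the full Gram matrix, so it does not reproduce the speed the abstract reports. The names `hik` and `kernel_kmeans_hik`, and the optional `init` argument, are hypothetical and chosen only for illustration.

```python
import random


def hik(x, y):
    """Histogram intersection kernel: sum of element-wise minima."""
    return sum(min(a, b) for a, b in zip(x, y))


def kernel_kmeans_hik(data, k, n_iter=20, seed=0, init=None):
    """Naive kernel k-means using HIK as the similarity (illustrative only).

    data: list of equal-length non-negative histograms.
    init: optional starting labels (hypothetical convenience argument);
          if omitted, labels are drawn at random from range(k).
    Returns a list with one cluster label per input histogram.
    """
    n = len(data)
    rng = random.Random(seed)
    # Gram matrix of pairwise kernel values, O(n^2) time and memory.
    gram = [[hik(data[i], data[j]) for j in range(n)] for i in range(n)]
    labels = list(init) if init is not None else [rng.randrange(k) for _ in range(n)]

    for _ in range(n_iter):
        clusters = [[i for i in range(n) if labels[i] == c] for c in range(k)]
        # Squared norm of each cluster mean in the kernel-induced feature space.
        mean_norm = [
            sum(gram[j][l] for j in members for l in members) / len(members) ** 2
            if members else None
            for members in clusters
        ]
        new_labels = []
        for i in range(n):
            best_c, best_d = labels[i], float("inf")
            for c, members in enumerate(clusters):
                if not members:
                    continue  # an empty cluster cannot attract points
                cross = sum(gram[i][j] for j in members) / len(members)
                # ||phi(x_i) - m_c||^2 = K(i,i) - 2 * mean_j K(i,j) + ||m_c||^2
                d = gram[i][i] - 2.0 * cross + mean_norm[c]
                if d < best_d:
                    best_c, best_d = c, d
            new_labels.append(best_c)
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
    return labels


if __name__ == "__main__":
    histograms = [[4, 0], [5, 0], [0, 4], [0, 5]]
    print(kernel_kmeans_hik(histograms, k=2, init=[0, 0, 0, 1]))
```

Like ordinary k-means, this procedure only finds a local optimum, so the result depends on the starting labels; several restarts with different seeds are a common remedy.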

Original language: English (US)
Title of host publication: 2009 IEEE 12th International Conference on Computer Vision, ICCV 2009
Pages: 630-637
Number of pages: 8
State: Published - 2009
Externally published: Yes
Event: 12th International Conference on Computer Vision, ICCV 2009 - Kyoto, Japan
Duration: Sep 29 2009 - Oct 2 2009

Publication series

Name: Proceedings of the IEEE International Conference on Computer Vision

Other

Other: 12th International Conference on Computer Vision, ICCV 2009
Country/Territory: Japan
City: Kyoto
Period: 9/29/09 - 10/2/09

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition
