Efficient object localization with variation-normalized Gaussianized vectors

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

Effective object localization relies on an efficient and effective search method and on robust image representation and learning methods. Recently, the Gaussianized vector representation has been shown to be effective in several computer vision applications, such as facial age estimation, image scene categorization, and video event recognition. However, all of these tasks are classification or regression problems based on whole images. It has not yet been explored how this representation can be applied efficiently to object localization, which reveals the locations and sizes of objects. In this work, we present an efficient object localization approach for the Gaussianized vector representation, following the branch-and-bound search scheme introduced by Lampert et al. [5]. In particular, we design a quality bound for rectangle sets characterized by the Gaussianized vector representation, enabling fast hierarchical search. This bound can be obtained for any rectangle set in the image at little computational cost beyond that of calculating the Gaussianized vector representation for the whole image. Further, we propose incorporating a normalization approach that suppresses the variation within the object class and within the background class. Experiments on a multi-scale car dataset show that the proposed object localization approach based on the Gaussianized vector representation outperforms previous work using the histogram-of-keywords representation. The within-class variation normalization further boosts performance. This chapter is an extended version of our paper at the 1st International Workshop on Interactive Multimedia for Consumer Electronics at ACM Multimedia 2009 [16].

Original language: English (US)
Title of host publication: Intelligent Video Event Analysis and Understanding
Editors: Jianguo Zhang, Ling Shao, Lei Zhang, Graeme Jones
Pages: 93-109
Number of pages: 17
DOIs: https://doi.org/10.1007/978-3-642-17554-1_5
State: Published - Mar 4 2011

Publication series

Name: Studies in Computational Intelligence
Volume: 332
ISSN (Print): 1860-949X

Fingerprint

Consumer electronics
Computer vision
Railroad cars
Costs
Experiments

ASJC Scopus subject areas

  • Artificial Intelligence

Cite this

Zhuang, X., Zhou, X., Hasegawa-Johnson, M. A., & Huang, T. S. (2011). Efficient object localization with variation-normalized Gaussianized vectors. In J. Zhang, L. Shao, L. Zhang, & G. Jones (Eds.), Intelligent Video Event Analysis and Understanding (pp. 93-109). (Studies in Computational Intelligence; Vol. 332). https://doi.org/10.1007/978-3-642-17554-1_5

Efficient object localization with variation-normalized Gaussianized vectors. / Zhuang, Xiaodan; Zhou, Xi; Hasegawa-Johnson, Mark Allan; Huang, Thomas S.

Intelligent Video Event Analysis and Understanding. ed. / Jianguo Zhang; Ling Shao; Lei Zhang; Graeme Jones. 2011. p. 93-109 (Studies in Computational Intelligence; Vol. 332).

Zhuang, X, Zhou, X, Hasegawa-Johnson, MA & Huang, TS 2011, Efficient object localization with variation-normalized Gaussianized vectors. in J Zhang, L Shao, L Zhang & G Jones (eds), Intelligent Video Event Analysis and Understanding. Studies in Computational Intelligence, vol. 332, pp. 93-109. https://doi.org/10.1007/978-3-642-17554-1_5
Zhuang X, Zhou X, Hasegawa-Johnson MA, Huang TS. Efficient object localization with variation-normalized Gaussianized vectors. In Zhang J, Shao L, Zhang L, Jones G, editors, Intelligent Video Event Analysis and Understanding. 2011. p. 93-109. (Studies in Computational Intelligence). https://doi.org/10.1007/978-3-642-17554-1_5
Zhuang, Xiaodan ; Zhou, Xi ; Hasegawa-Johnson, Mark Allan ; Huang, Thomas S. / Efficient object localization with variation-normalized Gaussianized vectors. Intelligent Video Event Analysis and Understanding. editor / Jianguo Zhang ; Ling Shao ; Lei Zhang ; Graeme Jones. 2011. pp. 93-109 (Studies in Computational Intelligence).
@inbook{905da3b913e34770a9ad0c50f819dd05,
title = "Efficient object localization with variation-normalized Gaussianized vectors",
author = "Xiaodan Zhuang and Xi Zhou and Hasegawa-Johnson, {Mark Allan} and Huang, {Thomas S}",
year = "2011",
month = "3",
day = "4",
doi = "10.1007/978-3-642-17554-1_5",
language = "English (US)",
isbn = "9783642175534",
series = "Studies in Computational Intelligence",
pages = "93--109",
editor = "Jianguo Zhang and Ling Shao and Lei Zhang and Graeme Jones",
booktitle = "Intelligent Video Event Analysis and Understanding",

}

TY - CHAP

T1 - Efficient object localization with variation-normalized Gaussianized vectors

AU - Zhuang, Xiaodan

AU - Zhou, Xi

AU - Hasegawa-Johnson, Mark Allan

AU - Huang, Thomas S

PY - 2011/3/4

Y1 - 2011/3/4

UR - http://www.scopus.com/inward/record.url?scp=79952096272&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79952096272&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-17554-1_5

DO - 10.1007/978-3-642-17554-1_5

M3 - Chapter

AN - SCOPUS:79952096272

SN - 9783642175534

T3 - Studies in Computational Intelligence

SP - 93

EP - 109

BT - Intelligent Video Event Analysis and Understanding

A2 - Zhang, Jianguo

A2 - Shao, Ling

A2 - Zhang, Lei

A2 - Jones, Graeme

ER -