Hierarchical video clustering

Nemanja Petrovic, Nebojsa Jojic, Thomas S. Huang

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution


We present a novel generative model for video that represents a video as a mixture of transformed video scenes. The learning procedure automatically clusters video frames into video scenes and objects. The learning algorithm is based on a hierarchical, on-line EM algorithm. The fast Fourier transform (FFT) is used for rapid computation in the E and M steps of the EM algorithm. We use the model to (1) perform video clustering by grouping similar (up to translation and scale) video frames into clusters, and (2) robustly stabilize video by inferring translation and scale intensity for each frame. We believe that video scene modeling of this kind is essential to bridging the "semantic gap" in video understanding. We illustrate this with several results, reported in terms of both speed and accuracy.
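As a rough illustration of why the FFT helps in the E step, the sketch below scores one frame against one cluster mean under every cyclic 2-D translation at once. It is not the authors' implementation: it assumes isotropic Gaussian noise, handles translation only (no scale, no hierarchy, no on-line updates), and the names `shift_posterior` and `sigma2` are hypothetical.

```python
import numpy as np

def shift_posterior(frame, mean, sigma2=1.0):
    """Posterior over all cyclic 2-D shifts of `mean` given `frame`.

    Assumes (for illustration only) isotropic Gaussian pixel noise with
    variance `sigma2` and a uniform prior over shifts.
    """
    # Circular cross-correlation for every shift t in one pass:
    # corr[t] = sum_n frame[n] * mean[n - t]
    corr = np.real(np.fft.ifft2(np.fft.fft2(frame) * np.conj(np.fft.fft2(mean))))
    # ||frame - shift(mean, t)||^2 expanded; the two norms do not depend on t.
    sq_dist = np.sum(frame**2) + np.sum(mean**2) - 2.0 * corr
    log_p = -0.5 * sq_dist / sigma2
    log_p -= log_p.max()          # stabilize before exponentiating
    p = np.exp(log_p)
    return p / p.sum()

# Synthetic check: the frame is the cluster mean shifted by (3, 5) pixels.
rng = np.random.default_rng(0)
mean = rng.standard_normal((16, 16))
frame = np.roll(mean, shift=(3, 5), axis=(0, 1))
post = shift_posterior(frame, mean, sigma2=0.1)
best_shift = np.unravel_index(np.argmax(post), post.shape)
```

Evaluating each shift separately would cost on the order of N operations per shift for an N-pixel frame, so N^2 in total, whereas the FFT route costs on the order of N log N for all shifts together. In a full EM loop, a posterior of this kind would weight the shifted frames when updating the cluster mean in the M step.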

Original language: English (US)
Title of host publication: 2004 IEEE 6th Workshop on Multimedia Signal Processing
Number of pages: 4
State: Published - Dec 1 2004
Event: 2004 IEEE 6th Workshop on Multimedia Signal Processing - Siena, Italy
Duration: Sep 29 2004 to Oct 1 2004

Publication series

Name: 2004 IEEE 6th Workshop on Multimedia Signal Processing

Other: 2004 IEEE 6th Workshop on Multimedia Signal Processing

ASJC Scopus subject areas

  • Signal Processing
  • Engineering (all)
