Video scene segmentation using video and audio features

H. Sundaram, S. F. Chang

Research output: Contribution to conference › Paper › peer-review

Abstract

In this paper we present a novel algorithm for video scene segmentation. We model a scene as a semantically consistent chunk of audio-visual data. Central to the segmentation framework is the idea of a finite-memory model. We separately segment the audio and video data into scenes, using the data in memory. The audio segmentation algorithm determines the correlations amongst the envelopes of audio features. The video segmentation algorithm determines the correlations amongst shot key-frames. In both cases, scene boundaries are determined using local correlation minima. We then fuse the resulting segments using a nearest-neighbor algorithm that is further refined using a time-alignment distribution derived from the ground truth. The algorithm was tested on a difficult data set, the first hour of a commercial film, with good results: it achieves a scene segmentation accuracy of 84%.
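The boundary-detection step described in the abstract (scene breaks placed at local minima of a correlation signal) can be sketched as follows. This is a minimal illustration only; the function name, the threshold, and the toy correlation values are assumptions for demonstration, not the authors' implementation or parameters.

```python
def local_minima(values, threshold=0.5):
    """Return indices where the correlation signal has a local minimum
    below `threshold` (candidate scene boundaries).

    `threshold` is an illustrative assumption, not a value from the paper.
    """
    return [
        i for i in range(1, len(values) - 1)
        if values[i] < values[i - 1]
        and values[i] < values[i + 1]
        and values[i] < threshold
    ]

# Toy correlation curve: dips at indices 3 and 7 mark candidate boundaries.
corr = [0.9, 0.8, 0.7, 0.2, 0.8, 0.9, 0.6, 0.1, 0.7, 0.9]
boundaries = local_minima(corr)  # → [3, 7]
```

In the paper this idea is applied separately to the audio-feature and shot key-frame correlation signals before the two sets of candidate boundaries are fused.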

Original language: English (US)
Pages: 1145-1148
Number of pages: 4
State: Published - Dec 1 2000
Externally published: Yes
Event: 2000 IEEE International Conference on Multimedia and Expo (ICME 2000) - New York, NY, United States
Duration: Jul 30 2000 – Aug 2 2000

Other

Other: 2000 IEEE International Conference on Multimedia and Expo (ICME 2000)
Country: United States
City: New York, NY
Period: 7/30/00 – 8/2/00

ASJC Scopus subject areas

  • Engineering(all)
