Recognizing high-level audio-visual concepts using context

M. R. Naphade, T. S. Huang

Research output: Contribution to conference › Paper › peer-review

Abstract

Recognition of high-level semantics from audio-visual data is a challenging multimedia understanding problem. The difficulty lies mainly in the gap between low-level media features and high-level semantic concepts. In an attempt to bridge this gap we proposed a probabilistic framework for semantic understanding [6, 5]. The components of this framework are probabilistic multimedia objects and a graphical network of such objects. In this paper we show how the framework supports detection of multiple high-level concepts, which enjoy spatial and temporal support. More importantly, we show why context matters and how it can be modeled. Using a factor graph framework, we model context and use it to improve detection of sites, objects and events. Using the concepts Outdoor and flying-helicopter, we demonstrate how the factor graph multinet models context. Using ROC curves and probability of error curves, we support the intuition that context should help.
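As a rough illustration of why context can help, the sketch below links two binary concepts with a single pairwise factor and computes their marginals by brute-force enumeration, which on a graph this small gives the same answer that sum-product message passing would. It is not the authors' multinet: the function name, the detector scores, and the compatibility table are all hypothetical values chosen for illustration.

```python
from itertools import product


def contextual_marginals(local, pairwise):
    """Marginals for two binary concepts joined by one pairwise factor.

    local:    {name: (p_absent, p_present)} from independent detectors.
    pairwise: {(a, b): compatibility} for a, b in {0, 1}, where a indexes
              the first concept in `local` and b the second.
    """
    first, second = list(local)
    # Unnormalized joint: product of the two local factors and the context factor.
    joint = {
        (a, b): local[first][a] * local[second][b] * pairwise[(a, b)]
        for a, b in product((0, 1), repeat=2)
    }
    z = sum(joint.values())  # normalizing constant
    return {
        first: sum(v for (a, _), v in joint.items() if a == 1) / z,
        second: sum(v for (_, b), v in joint.items() if b == 1) / z,
    }


# Hypothetical detector outputs: the Outdoor detector is undecided,
# the flying-helicopter detector is fairly confident.
local = {"outdoor": (0.5, 0.5), "flying_helicopter": (0.1, 0.9)}

# Hypothetical context factor: a flying helicopter in a non-outdoor
# shot is treated as unlikely; every other combination is neutral.
pairwise = {(0, 0): 1.0, (0, 1): 0.1, (1, 0): 1.0, (1, 1): 1.0}

marginals = contextual_marginals(local, pairwise)
print(marginals)
```

With these made-up numbers the Outdoor marginal rises above its local score of 0.5, because the confident helicopter detection makes the non-outdoor explanation less compatible. That is the kind of effect the abstract refers to when it says context should help.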

Original language: English (US)
Pages: 46-49
Number of pages: 4
State: Published - 2001
Externally published: Yes
Event: IEEE International Conference on Image Processing (ICIP) - Thessaloniki, Greece
Duration: Oct 7 2001 → Oct 10 2001

Other

Other: IEEE International Conference on Image Processing (ICIP)
Country/Territory: Greece
City: Thessaloniki
Period: 10/7/01 → 10/10/01

ASJC Scopus subject areas

  • Hardware and Architecture
  • Computer Vision and Pattern Recognition
  • Electrical and Electronic Engineering
