Automatic detection of auditory salience with optimized linear filters derived from human annotation

Kyungtae Kim, Kai Hsiang Lin, Dirk B. Walther, Mark A. Hasegawa-Johnson, Thomas S. Huang

Research output: Contribution to journal › Article › peer-review

Abstract

Auditory salience describes how much a particular auditory event attracts human attention. Previous attempts at automatic detection of salient audio events have been hampered by the challenge of defining ground truth. In this paper, ground truth for auditory salience is built up from annotations by human subjects of a large corpus of meeting room recordings. Following statistical purification of the data, an optimal auditory salience filter with linear discrimination is derived from the purified data. An automatic auditory salience detector based on optimal filtering of the Bark-frequency loudness achieves a 32% equal error rate. Expanding the feature vector to include other common feature sets does not improve performance. Consistent with intuition, the optimal filter looks like an onset detector in the time domain.
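
As an illustration of the two ideas the abstract names, linear filtering of a band-wise loudness representation and scoring by equal error rate, the following is a minimal sketch and not the authors' implementation. The function names `filter_scores` and `equal_error_rate`, the array shapes, and the two-tap onset-like weights are all assumptions made for the example; the paper's filter is learned from human annotations, which is not reproduced here.

```python
import numpy as np


def filter_scores(loudness, weights):
    """Score each frame with a causal linear filter (hypothetical helper).

    loudness: array of shape (bands, frames), e.g. loudness per Bark band.
    weights:  array of shape (bands, taps), one FIR filter per band.
    Returns one salience score per frame.
    """
    loudness = np.asarray(loudness, dtype=float)
    weights = np.asarray(weights, dtype=float)
    bands, frames = loudness.shape
    scores = np.zeros(frames)
    for b in range(bands):
        # Full convolution truncated to the first `frames` samples is causal:
        # out[n] = sum_k weights[b, k] * loudness[b, n - k]
        scores += np.convolve(loudness[b], weights[b], mode="full")[:frames]
    return scores


def equal_error_rate(scores, labels):
    """Return (eer, threshold) at the point where the false-alarm rate and
    the miss rate are closest. Assumes both classes are present in `labels`.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    best = None
    for t in np.unique(scores):
        predicted = scores >= t
        false_alarm = np.mean(predicted[~labels])  # non-salient flagged salient
        miss = np.mean(~predicted[labels])         # salient frames not flagged
        gap = abs(false_alarm - miss)
        if best is None or gap < best[0]:
            best = (gap, (false_alarm + miss) / 2.0, t)
    return best[1], best[2]


# Toy example: one band whose loudness steps up at frame 3.
toy_loudness = np.array([[0.0, 0.0, 0.0, 1.0, 1.0, 1.0]])
# Assumed onset-like filter (first difference), not the learned filter.
toy_weights = np.array([[1.0, -1.0]])
toy_scores = filter_scores(toy_loudness, toy_weights)
```

In this toy case the first-difference filter responds only at the frame where loudness rises, which is the sense in which a time-domain filter can behave like an onset detector.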

Original language: English (US)
Pages (from-to): 78-85
Number of pages: 8
Journal: Pattern Recognition Letters
Volume: 38
Issue number: 1
State: Published - Mar 1 2014

Keywords

  • Auditory salience
  • Conference room
  • Detection
  • Nonlinear programming

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Computer Vision and Pattern Recognition
  • Artificial Intelligence
