Extended minimum classification error training in voice activity detection

Takayuki Arakawa, Haitham Al-Hassanieh, Masanori Tsujikawa, Ryosuke Isotani

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Voice Activity Detection (VAD) is a fundamental part of speech processing. Combining multiple acoustic features is an effective way to make VAD more robust against various noise conditions. Several feature combination methods have been proposed in which the weights for the feature values are optimized by Minimum Classification Error (MCE) training. We improve these MCE-based methods by introducing a novel discriminative function for whole frames. The proposed method optimizes the combination weights while taking into account the ratio between the false acceptance and false rejection rates, as well as the effect of shaping procedures such as hangover.
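The general idea behind MCE-trained feature combination can be illustrated with a minimal sketch. It is not the method proposed in the paper: the abstract does not specify the new discriminative function or how the false acceptance/false rejection ratio enters the loss, so the code below only shows a conventional per-frame sigmoid MCE loss on a weighted sum of features, followed by a simple hangover rule. All function names, the threshold, the slope, the learning rate, and the hangover length are assumptions made for illustration.

```python
import math


def combined_score(features, weights):
    """Weighted sum of the per-frame feature values."""
    return sum(w * f for w, f in zip(weights, features))


def mce_loss(score, is_speech, threshold=0.0, slope=1.0):
    """Sigmoid-smoothed classification error for a single frame."""
    # Misclassification measure: positive when the score falls on the
    # wrong side of the threshold for the frame's label.
    d = (threshold - score) if is_speech else (score - threshold)
    return 1.0 / (1.0 + math.exp(-slope * d))


def mce_step(frames, labels, weights, lr=0.1, threshold=0.0, slope=1.0):
    """One batch gradient-descent step on the mean sigmoid loss."""
    grad = [0.0] * len(weights)
    for x, y in zip(frames, labels):
        loss = mce_loss(combined_score(x, weights), y, threshold, slope)
        sign = -1.0 if y else 1.0  # derivative of d with respect to the score
        g = slope * loss * (1.0 - loss) * sign
        for i, xi in enumerate(x):
            grad[i] += g * xi
    n = len(frames)
    return [w - lr * gi / n for w, gi in zip(weights, grad)]


def apply_hangover(decisions, hang=2):
    """Hold the speech decision for `hang` frames after speech ends."""
    out, remaining = [], 0
    for is_speech in decisions:
        if is_speech:
            remaining = hang
            out.append(True)
        elif remaining > 0:
            remaining -= 1
            out.append(True)
        else:
            out.append(False)
    return out
```

In this sketch the weights are trained on raw per-frame decisions and the hangover is applied only afterwards. According to the abstract, the proposed method differs on this point: it accounts for the effect of such shaping procedures during weight optimization.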

Original language: English (US)
Title of host publication: Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2009
Pages: 232-236
Number of pages: 5
State: Published - Dec 1 2009
Externally published: Yes
Event: 2009 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2009 - Merano, Italy
Duration: Dec 13 2009 to Dec 17 2009

Publication series

Name: Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2009

Other

Other: 2009 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2009
Country/Territory: Italy
City: Merano
Period: 12/13/09 to 12/17/09

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
  • Human-Computer Interaction
  • Signal Processing
