Incorporating auditory feature uncertainties in robust speaker identification

Yang Shao, Soundararajan Srinivasan, Deliang Wang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Conventional speaker recognition systems perform poorly under noisy conditions. Recent research suggests that binary time-frequency (T-F) masks are a promising front-end for robust speaker recognition. In this paper, we propose novel auditory features based on an auditory periphery model, and show that these features capture significant speaker characteristics. Additionally, we estimate the uncertainties of the auditory features from binary T-F masks, and calculate speaker likelihood scores using uncertainty decoding. In a speaker identification task, our approach achieves a substantial performance improvement over a state-of-the-art robust front-end across a wide range of signal-to-noise conditions.
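The abstract does not spell out the scoring step, so the following is only a minimal sketch of uncertainty decoding in its common generic form, not the authors' implementation. It assumes each speaker is modeled by a diagonal-covariance Gaussian mixture and that every feature frame arrives with a per-dimension uncertainty variance (here the hypothetical input `x_var`, however it was estimated from the binary T-F mask). Under those assumptions, integrating over the uncertain feature amounts to adding the uncertainty variance to each component variance before evaluating the likelihood. All function names are illustrative.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of log values."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def gmm_loglik_uncertain(x, x_var, weights, means, variances):
    """Frame log-likelihood with the feature uncertainty x_var (assumed input)
    added to every component variance."""
    terms = []
    for w, mean, var in zip(weights, means, variances):
        inflated = [v + u for v, u in zip(var, x_var)]
        terms.append(math.log(w) + log_gauss_diag(x, mean, inflated))
    return logsumexp(terms)


def identify(frames, frame_vars, speaker_models):
    """Return the best-scoring speaker and all total log-likelihood scores.

    speaker_models maps a name to a (weights, means, variances) tuple.
    """
    scores = {
        name: sum(
            gmm_loglik_uncertain(x, u, *gmm)
            for x, u in zip(frames, frame_vars)
        )
        for name, gmm in speaker_models.items()
    }
    return max(scores, key=scores.get), scores
```

With `x_var` set to zero in every dimension, the score reduces to the ordinary mixture log-likelihood; larger uncertainties flatten each component, so unreliable dimensions contribute less to the decision between speakers.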

Original language: English (US)
Title of host publication: 2007 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '07
Pages: IV277-IV280
DOI: https://doi.org/10.1109/ICASSP.2007.366903
State: Published - Aug 6 2007
Externally published: Yes
Event: 2007 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '07 - Honolulu, HI, United States
Duration: Apr 15 2007 to Apr 20 2007

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 4
ISSN (Print): 1520-6149

Other

Other: 2007 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '07
Country: United States
City: Honolulu, HI
Period: 4/15/07 to 4/20/07

Keywords

  • Auditory features
  • Robust speaker identification
  • Uncertainty decoding

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

  • Cite this

    Shao, Y., Srinivasan, S., & Wang, D. (2007). Incorporating auditory feature uncertainties in robust speaker identification. In 2007 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '07 (pp. IV277-IV280). [4218091] (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings; Vol. 4). https://doi.org/10.1109/ICASSP.2007.366903