Robust speaker identification using auditory features and computational auditory scene analysis

Yang Shao, De Liang Wang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The performance of speaker recognition systems drops significantly under noisy conditions. To improve robustness, we have recently proposed novel auditory features and a robust speaker recognition system using a front-end based on computational auditory scene analysis. In this paper, we further study the auditory features by exploring different feature dimensions and incorporating dynamic features. In addition, we evaluate the features and the robust recognition system on a speaker identification task under a number of noisy conditions. We find that one of the auditory features performs substantially better than a conventional speaker feature. Furthermore, our recognition system achieves significant performance improvements over an advanced front-end across a wide range of signal-to-noise conditions.
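The gammatone-feature front-end referred to in the abstract and keywords (gammatone frequency cepstral coefficients, GFCC) can be sketched roughly as follows. This is a minimal illustration under assumed settings, not the authors' implementation: the 64-channel ERB-spaced filterbank, 20 ms frames with 10 ms shift, cubic-root compression, and 23 retained cepstral coefficients are all assumptions, and the CASA-based mask estimation that provides the robustness reported in the paper is not shown.

```python
import numpy as np
from scipy.signal import lfilter
from scipy.fftpack import dct


def erb_space(low_hz, high_hz, num_channels):
    """Center frequencies equally spaced on the ERB-rate scale (Glasberg & Moore)."""
    ear_q, min_bw = 9.26449, 24.7
    idx = np.arange(1, num_channels + 1)
    return -(ear_q * min_bw) + np.exp(
        idx * (np.log(low_hz + ear_q * min_bw) - np.log(high_hz + ear_q * min_bw))
        / num_channels) * (high_hz + ear_q * min_bw)


def gammatone_fir(fc, fs, duration=0.025, order=4):
    """FIR approximation of a 4th-order gammatone impulse response at center freq fc."""
    t = np.arange(int(duration * fs)) / fs
    erb = 24.7 + fc / 9.265                       # equivalent rectangular bandwidth
    b = 1.019 * erb                               # gammatone bandwidth parameter
    g = t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)
    return g / (np.sqrt(np.sum(g ** 2)) + 1e-12)  # energy-normalize the impulse response


def gfcc(signal, fs=16000, num_channels=64, num_ceps=23,
         frame_len=0.020, frame_shift=0.010):
    """Sketch of GFCC extraction: gammatone filterbank -> frame energies -> cubic root -> DCT.
    All parameter values here are assumptions, not the paper's exact configuration."""
    centers = erb_space(50.0, fs / 2.0 - 1.0, num_channels)
    win, hop = int(frame_len * fs), int(frame_shift * fs)
    num_frames = 1 + (len(signal) - win) // hop
    cochleagram = np.empty((num_frames, num_channels))
    for c, fc in enumerate(centers):
        response = lfilter(gammatone_fir(fc, fs), [1.0], signal)
        for m in range(num_frames):
            frame = response[m * hop: m * hop + win]
            cochleagram[m, c] = np.mean(frame ** 2)   # per-frame channel energy
    compressed = np.cbrt(cochleagram)                 # loudness-style (cubic-root) compression
    return dct(compressed, type=2, axis=1, norm='ortho')[:, :num_ceps]


def deltas(features, width=2):
    """Dynamic (delta) features via a standard regression over +/- `width` frames."""
    padded = np.pad(features, ((width, width), (0, 0)), mode='edge')
    denom = 2 * sum(k * k for k in range(1, width + 1))
    return sum(k * (padded[width + k: len(features) + width + k]
                    - padded[width - k: len(features) + width - k])
               for k in range(1, width + 1)) / denom
```

In this sketch, stacking the static and dynamic coefficients per frame, e.g. np.hstack([gfcc(x), deltas(gfcc(x))]), corresponds to the "incorporating dynamic features" step mentioned in the abstract; the exact dimensionalities explored in the paper are not reproduced here.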

Original language: English (US)
Title of host publication: 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP
Pages: 1589-1592
Number of pages: 4
DOIs
State: Published - 2008
Externally published: Yes
Event: 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP - Las Vegas, NV, United States
Duration: Mar 31, 2008 – Apr 4, 2008

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Other

Other: 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP
Country/Territory: United States
City: Las Vegas, NV
Period: 3/31/08 – 4/4/08

Keywords

  • Auditory feature
  • Computational auditory scene analysis
  • Gammatone feature
  • Gammatone frequency cepstral coefficient
  • Robust speaker recognition

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
