Exploring discriminative learning for text-independent speaker recognition

Ming Liu, Zhengyou Zhang, Mark Hasegawa-Johnson, Thomas S. Huang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Speaker verification is the task of verifying the claimed identity of a speaker based on that speaker's speech signal (voice print). To learn a similarity score between each pair of target and trial utterances, we investigated two discriminative learning frameworks: Fisher mapping followed by SVM learning, and utterance transform followed by Iterative Cohort Modeling (ICM). In both methods, a mapping converts a speech utterance from a variable-length acoustic feature sequence into a fixed-dimensional vector. SVM learning constructs a classifier in the mapped vector space for speaker verification. ICM learns a metric in this vector space by incorporating discriminative learning methods; the resulting metric is then used by a nearest-neighbor classifier for speaker verification. Experiments on the NIST02 corpus show that both discriminative learning methods outperform the baseline GMM-UBM system. Furthermore, the ICM-based method is more effective than the SVM-based method, indicating that the metric learning scheme is more powerful at constructing a good metric in the mapped vector space.
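The pipeline described in the abstract (map each variable-length utterance to a fixed-dimensional vector, then score target/trial pairs in that space) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the mapping is a simple per-dimension mean and standard deviation rather than the paper's Fisher mapping or utterance transform, the metric is a fixed cosine similarity rather than a metric learned by ICM, and the names `utterance_to_vector`, `cosine`, and `verify` are hypothetical.

```python
import math

def utterance_to_vector(frames):
    """Map a variable-length list of feature frames to a fixed-length vector.

    Stand-in mapping (per-dimension mean and standard deviation); this is
    NOT the Fisher mapping or utterance transform used in the paper.
    """
    n, dim = len(frames), len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    stds = [math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / n)
            for d in range(dim)]
    return means + stds

def cosine(u, v):
    """Cosine similarity; a fixed metric standing in for a learned one."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def verify(trial_frames, target_frames, cohort_frames_list):
    """Accept the claim if the target is the trial's nearest neighbor,
    i.e. it scores higher than every cohort (impostor) utterance."""
    trial = utterance_to_vector(trial_frames)
    target_score = cosine(trial, utterance_to_vector(target_frames))
    cohort_scores = [cosine(trial, utterance_to_vector(c))
                     for c in cohort_frames_list]
    return all(target_score > s for s in cohort_scores)

# Toy 2-dimensional "acoustic features"; utterances have different lengths.
target = [[1.0, 2.0], [1.2, 2.1], [0.9, 1.9]]
trial = [[1.1, 2.0], [1.0, 2.2]]
impostor = [[-3.0, 0.5], [-2.5, 0.7], [-3.2, 0.4], [-2.9, 0.6]]

accepted = verify(trial, target, [impostor])
```

In this toy setup, utterances of different lengths all map to vectors of the same dimension, which is what allows a single classifier or metric to operate on them.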

Original language: English (US)
Title of host publication: Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, ICME 2007
Publisher: IEEE Computer Society
Pages: 56-59
Number of pages: 4
ISBN (Print): 1424410177, 9781424410170
State: Published - 2007
Event: IEEE International Conference on Multimedia and Expo, ICME 2007 - Beijing, China
Duration: Jul 2 2007 → Jul 5 2007

Publication series

Name: Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, ICME 2007

Other

Other: IEEE International Conference on Multimedia and Expo, ICME 2007
Country/Territory: China
City: Beijing
Period: 7/2/07 → 7/5/07

ASJC Scopus subject areas

  • Computer Graphics and Computer-Aided Design
  • Software
