Locality preserving speaker clustering

Stephen M. Chu, Hao Tang, Thomas S. Huang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In this paper, we propose an efficient speaker clustering approach based on a locality preserving linear projective mapping in the Gaussian mixture model (GMM) mean supervector space. While the GMM mean supervector has turned out to be an effective representation of speakers, its dimensionality is usually very high. The locality preserving projection (LPP) maps the high-dimensional GMM mean supervector space into a lower-dimensional subspace in an unsupervised fashion where the local neighborhood structure of the data points is optimally preserved. Our speaker clustering experiments clearly show that in the reduced-dimensional LPP subspace, traditional clustering techniques such as k-means and hierarchical clustering perform significantly better than they would in the original high-dimensional GMM mean supervector space and in its principal component subspace.
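
The pipeline described in the abstract (project supervectors with LPP, then cluster in the subspace) can be illustrated with a minimal sketch. The abstract does not specify the graph construction, neighborhood size, kernel width, or target dimension, so the k-nearest-neighbor graph, heat-kernel weights, regularization, and all parameter values below are assumptions chosen for illustration. The names `lpp` and `kmeans` are hypothetical helpers, and random Gaussian vectors stand in for real GMM mean supervectors.

```python
import numpy as np

def lpp(X, n_components=2, n_neighbors=5, t=None, reg=1e-6):
    """Sketch of locality preserving projection. X has shape (n_samples, n_features)."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    n, d = Xc.shape
    # Pairwise squared Euclidean distances.
    sq = np.sum(Xc ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (Xc @ Xc.T), 0.0)
    if t is None:
        t = d2.mean() + 1e-12  # heat-kernel width: a heuristic, not from the paper
    # k-nearest-neighbor graph (self excluded), symmetrized, heat-kernel weights.
    d2_no_self = d2.copy()
    np.fill_diagonal(d2_no_self, np.inf)
    idx = np.argsort(d2_no_self, axis=1)[:, :n_neighbors]
    rows = np.repeat(np.arange(n), n_neighbors)
    cols = idx.ravel()
    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-d2[rows, cols] / t)
    W = np.maximum(W, W.T)
    D = np.diag(W.sum(axis=1))
    L = D - W  # graph Laplacian
    # Generalized eigenproblem (X^T L X) a = lambda (X^T D X) a, smallest lambda first.
    A = Xc.T @ L @ Xc
    B = Xc.T @ D @ Xc
    B = B + reg * (np.trace(B) / d) * np.eye(d)  # small ridge keeps B positive definite
    C = np.linalg.cholesky(B)                    # B = C C^T
    M = np.linalg.solve(C, np.linalg.solve(C, A).T)  # C^{-1} A C^{-T}
    M = 0.5 * (M + M.T)
    _, vecs = np.linalg.eigh(M)                  # eigenvalues in ascending order
    P = np.linalg.solve(C.T, vecs[:, :n_components])  # map back: a = C^{-T} v
    return Xc @ P, P

def kmeans(Y, k, n_iter=50):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [Y[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((Y - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(Y[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(dists, axis=1)
        new = np.array([Y[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels

# Toy stand-in for supervectors: 3 "speakers", 30 vectors each, 20 dimensions.
rng = np.random.default_rng(0)
true = np.repeat(np.arange(3), 30)
speaker_means = 8.0 * np.eye(3, 20)
X = speaker_means[true] + 0.5 * rng.standard_normal((90, 20))

Y, P = lpp(X, n_components=2, n_neighbors=5)
labels = kmeans(Y, k=3)
```

Real supervectors are far higher-dimensional than the number of utterances, in which case `X^T D X` is singular. The ridge term here is one way to handle that; applying PCA before LPP is another common choice. Which of these the paper uses is not stated in the abstract.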

Original language: English (US)
Title of host publication: Proceedings - 2009 IEEE International Conference on Multimedia and Expo, ICME 2009
Pages: 494-497
Number of pages: 4
State: Published - 2009
Externally published: Yes
Event: 2009 IEEE International Conference on Multimedia and Expo, ICME 2009 - New York, NY, United States
Duration: Jun 28 2009 → Jul 3 2009

Keywords

  • Gaussian mixture model
  • Locality preserving projection
  • Mean supervector
  • Speaker clustering
  • Subspace

ASJC Scopus subject areas

  • Computer Graphics and Computer-Aided Design
  • Computer Networks and Communications
  • Hardware and Architecture
  • Software
