TY - GEN
T1 - Fishervoice and semi-supervised speaker clustering
AU - Chu, Stephen M.
AU - Tang, Hao
AU - Huang, Thomas S.
PY - 2009
Y1 - 2009
N2 - Speaker subspace modeling has become increasingly important in speaker recognition, diarization, and clustering. Principal component analysis (PCA) is a popular linear subspace learning technique, and the approach that represents an arbitrary utterance or speaker as a linear combination of a set of basis voices based on PCA is known as the eigenvoice approach. In this paper, a novel technique, namely the fishervoice approach, is proposed. The fishervoice approach is based on linear discriminant analysis, another successful linear subspace learning technique that provides an optimized low-dimensional representation of utterances or speakers with a focus on the most discriminative basis voices. We apply the fishervoice approach to speaker clustering in a semi-supervised manner and show that the fishervoice approach significantly outperforms the eigenvoice approach in all our experiments on the GALE Mandarin dataset.
AB - Speaker subspace modeling has become increasingly important in speaker recognition, diarization, and clustering. Principal component analysis (PCA) is a popular linear subspace learning technique, and the approach that represents an arbitrary utterance or speaker as a linear combination of a set of basis voices based on PCA is known as the eigenvoice approach. In this paper, a novel technique, namely the fishervoice approach, is proposed. The fishervoice approach is based on linear discriminant analysis, another successful linear subspace learning technique that provides an optimized low-dimensional representation of utterances or speakers with a focus on the most discriminative basis voices. We apply the fishervoice approach to speaker clustering in a semi-supervised manner and show that the fishervoice approach significantly outperforms the eigenvoice approach in all our experiments on the GALE Mandarin dataset.
KW - Eigenvoice
KW - Fisher-voice
KW - Linear subspace learning
KW - Semi-supervised speaker clustering
UR - http://www.scopus.com/inward/record.url?scp=70349194585&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70349194585&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2009.4960527
DO - 10.1109/ICASSP.2009.4960527
M3 - Conference contribution
AN - SCOPUS:70349194585
SN - 9781424423545
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 4089
EP - 4092
BT - 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing - Proceedings, ICASSP 2009
T2 - 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2009
Y2 - 19 April 2009 through 24 April 2009
ER -