Abstract
This paper reports on segmenting acoustic signals from the TIMIT database with a switching-state Kalman filter model. On the assumption that the high-dimensional LSF (line spectrum frequency) feature vector of the speech signal is likely embedded in a low-dimensional space, the model represents the continuous state with a two-dimensional vector. The model parameters are initialized by PPCA (probabilistic principal component analysis) and first-order vector autoregression, and are then re-estimated by the EM algorithm. We show that, with approximate Viterbi inference, the model can classify vowels, nasals, frication and silence.
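The initialization step described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: it assumes the closed-form maximum-likelihood PPCA solution and an ordinary least-squares fit of a first-order vector autoregression, and it handles a single regime only (a switching model would hold one such set of dynamics per discrete state). The function names `ppca_init` and `var1_init`, the 10-dimensional observation size, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def ppca_init(Y, q=2):
    """Closed-form PPCA fit (assumed here, not taken from the paper).

    Y is a (T, d) array of feature vectors; q is the latent dimension.
    Returns the mean, loading matrix W (d, q), noise variance sigma2,
    and the posterior-mean latent vectors X (T, q).
    """
    mu = Y.mean(axis=0)
    Yc = Y - mu
    S = Yc.T @ Yc / Y.shape[0]                 # sample covariance
    evals, evecs = np.linalg.eigh(S)           # ascending eigenvalues
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]
    sigma2 = evals[q:].mean()                  # average discarded variance
    scale = np.sqrt(np.maximum(evals[:q] - sigma2, 0.0))
    W = evecs[:, :q] * scale
    M = W.T @ W + sigma2 * np.eye(q)
    X = np.linalg.solve(M, W.T @ Yc.T).T       # E[x | y] for each frame
    return mu, W, sigma2, X

def var1_init(X):
    """Least-squares fit of x_t = A x_{t-1} + w_t, with w_t ~ N(0, Q)."""
    X0, X1 = X[:-1], X[1:]
    B, *_ = np.linalg.lstsq(X0, X1, rcond=None)  # solves X0 @ B ~= X1
    resid = X1 - X0 @ B
    Q = resid.T @ resid / resid.shape[0]
    return B.T, Q

# Synthetic stand-in for LSF frames (hypothetical data, not TIMIT).
rng = np.random.default_rng(0)
T, d = 500, 10
A_true = np.array([[0.9, -0.2], [0.2, 0.9]])
x = np.zeros((T, 2))
for t in range(1, T):
    x[t] = A_true @ x[t - 1] + rng.normal(scale=0.1, size=2)
C = rng.normal(size=(d, 2))
Y = x @ C.T + rng.normal(scale=0.01, size=(T, d))

mu, W, sigma2, X = ppca_init(Y, q=2)   # observation-model starting point
A, Q = var1_init(X)                    # state-dynamics starting point
```

In a sketch like this, `W`, `mu` and `sigma2` would seed the observation equation of the Kalman filter, while `A` and `Q` would seed the state dynamics; EM re-estimation would then refine all of them.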
Original language | English (US) |
---|---|
Pages (from-to) | 752-755 |
Number of pages | 4 |
Journal | ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings |
Volume | 1 |
State | Published - 2003 |
Event | 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing - Hong Kong, Hong Kong; Duration: Apr 6 2003 → Apr 10 2003 |
ASJC Scopus subject areas
- Software
- Signal Processing
- Electrical and Electronic Engineering