TY - GEN
T1 - Strong-sense class-dependent features for statistical recognition
AU - Omar, M. K.
AU - Hasegawa-Johnson, M.
N1 - Publisher Copyright:
© 2003 IEEE.
PY - 2003
Y1 - 2003
N2 - In statistical classification and recognition problems with many classes, different classes commonly exhibit wildly different properties. In this case it is unreasonable to expect to summarize these properties using features designed to represent all of the classes. Instead, features should be designed to represent subsets of classes that share common properties, without regard to any class outside the subset. The values of these features for classes outside the subset may be meaningless or simply undefined. The main problem, due to the statistical nature of the recognizer, is how to compare likelihoods conditioned on different sets of features when decoding an input pattern. This paper introduces a class-dependent feature design approach that can be integrated with any probabilistic model. The approach avoids the need for a conditional probabilistic model for each class and feature-type pair, and therefore decreases the computational and storage requirements of using heterogeneous features. The paper presents an algorithm that calculates the class-dependent features minimizing an estimate of the relative entropy between the conditional probabilistic model and the actual conditional probability density function (PDF) of the features of each class. The approach is applied to a hidden Markov model (HMM) automatic speech recognition (ASR) system, in which a nonlinear class-dependent volume-preserving transformation of the features is used to minimize the objective function. Using this approach, a 2% improvement in phoneme recognition accuracy is achieved compared to the baseline system. The approach also improves recognition accuracy compared to previous class-dependent linear feature transformations.
AB - In statistical classification and recognition problems with many classes, different classes commonly exhibit wildly different properties. In this case it is unreasonable to expect to summarize these properties using features designed to represent all of the classes. Instead, features should be designed to represent subsets of classes that share common properties, without regard to any class outside the subset. The values of these features for classes outside the subset may be meaningless or simply undefined. The main problem, due to the statistical nature of the recognizer, is how to compare likelihoods conditioned on different sets of features when decoding an input pattern. This paper introduces a class-dependent feature design approach that can be integrated with any probabilistic model. The approach avoids the need for a conditional probabilistic model for each class and feature-type pair, and therefore decreases the computational and storage requirements of using heterogeneous features. The paper presents an algorithm that calculates the class-dependent features minimizing an estimate of the relative entropy between the conditional probabilistic model and the actual conditional probability density function (PDF) of the features of each class. The approach is applied to a hidden Markov model (HMM) automatic speech recognition (ASR) system, in which a nonlinear class-dependent volume-preserving transformation of the features is used to minimize the objective function. Using this approach, a 2% improvement in phoneme recognition accuracy is achieved compared to the baseline system. The approach also improves recognition accuracy compared to previous class-dependent linear feature transformations.
KW - Automatic speech recognition
KW - Decoding
KW - Entropy
KW - Hidden Markov models
KW - Pattern recognition
KW - Performance loss
KW - Probability density function
KW - Robustness
KW - Speech recognition
UR - http://www.scopus.com/inward/record.url?scp=84863769386&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84863769386&partnerID=8YFLogxK
U2 - 10.1109/SSP.2003.1289454
DO - 10.1109/SSP.2003.1289454
M3 - Conference contribution
AN - SCOPUS:84863769386
T3 - IEEE Workshop on Statistical Signal Processing Proceedings
SP - 490
EP - 493
BT - Proceedings of the 2003 IEEE Workshop on Statistical Signal Processing, SSP 2003
PB - IEEE Computer Society
T2 - IEEE Workshop on Statistical Signal Processing, SSP 2003
Y2 - 28 September 2003 through 1 October 2003
ER -