Bayesian classifiers rely on models of the a priori and class-conditional feature distributions; the classifier is trained by optimizing these models to best represent the features observed in a training corpus according to a chosen criterion. In many problems of interest, the true class-conditional feature probability density function (PDF) is not a member of the set of PDFs the classifier can represent. Previous research has shown that the effect of this mismatch may be reduced either by improving the models or by transforming the features used by the classifier. This paper addresses this model-mismatch problem in statistical identification, classification, and recognition systems. We formulate it as minimizing the relative entropy, also known as the Kullback-Leibler distance, between the true conditional PDF and the hypothesized probabilistic model. Based on this formulation, we provide a computationally efficient solution built on volume-preserving maps; existing linear transform designs are shown to be special cases of the proposed solution. Using this result, we propose the symplectic maximum likelihood transform (SMLT), a nonlinear volume-preserving extension of the maximum likelihood linear transform (MLLT). The approach has many applications in statistical modeling, classification, and recognition. We apply it to maximum likelihood estimation (MLE) of the joint PDF of order statistics and show a significant increase in likelihood for the same number of parameters. We also report phoneme recognition experiments that show improved recognition accuracy compared with the baseline Mel-frequency cepstral coefficient (MFCC) features or with MLLT. For both applications, we present an iterative algorithm that jointly estimates the parameters of the symplectic map and the probabilistic model.
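A minimal sketch of the intuition behind volume-preserving feature maps, not the paper's SMLT algorithm: a shear map (x, y) → (x, y − x²) has Jacobian determinant 1, so log-likelihoods computed before and after the map are directly comparable with no Jacobian correction, and a simple diagonal-Gaussian model can fit the transformed features far better. All names and the synthetic data here are illustrative assumptions.

```python
import numpy as np

def diag_gaussian_avg_loglik(X):
    """Average per-sample log-likelihood under an MLE-fitted diagonal Gaussian."""
    var = X.var(axis=0)  # MLE variances, one per dimension
    return float(np.sum(-0.5 * np.log(2 * np.pi * var) - 0.5))

rng = np.random.default_rng(0)
x = rng.normal(size=5000)
y = x**2 + 0.1 * rng.normal(size=5000)   # nonlinear dependence on x
X = np.column_stack([x, y])

# Volume-preserving shear: determinant of the Jacobian is exactly 1,
# so the density of the mapped features needs no |det J| factor.
Z = np.column_stack([x, y - x**2])

ll_raw = diag_gaussian_avg_loglik(X)
ll_mapped = diag_gaussian_avg_loglik(Z)
print(ll_mapped > ll_raw)  # the simple model fits the mapped features better
```

The same principle underlies the paper's approach: because the map preserves volume, optimizing the map and the model parameters jointly increases the true likelihood rather than merely rescaling the density.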