This letter demonstrates hidden Markov model (HMM), multilayer perceptron (MLP), and time-delay recursive neural network (TDRNN) architectures for recognizing pitch accents from observed F0 and energy trajectories. At an insertion error rate of 25%, the deletion error rates of the MLP, TDRNN, and HMM are 13.2%, 7.9%, and 32.7%, respectively, even though the MLP and TDRNN both have 70% fewer trainable parameters than the HMM. Error analysis suggests that low-pitch accents may require long-term context to be recognized correctly, while high-pitch accents may be recognizable from the local pitch contour.
ASJC Scopus subject areas
- Signal Processing
- Electrical and Electronic Engineering
- Applied Mathematics