A widely accepted linguistic theory holds that speech recognition in humans proceeds from an intermediate representation of the acoustic signal in terms of a small number of phonetic symbols. A novel speech recognition system based on this theory is described, in which the acoustic-to-phonetic mapping is accomplished by means of a particular form of hidden Markov model and is independent of lexical and syntactic constraints. Word recognition is then treated as a classical string-to-string editing problem, which is solved with a two-level dynamic programming algorithm that accounts for lexical and syntactic structure. The system was tested on speaker-independent recognition of fluent speech from the 991-word DARPA resource management task, on which 76.6% word accuracy was achieved. In informal tests, it was observed that the phonetic transcription can be resynthesized to provide a 100-b/s vocoder with word intelligibility rates of approximately 75%.
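
The string-to-string editing view of word recognition can be illustrated with a minimal sketch. It is not the two-level dynamic programming algorithm of the paper: it is plain single-level Levenshtein distance with unit costs, applied to one isolated word, with no syntactic constraint. The function names `edit_distance` and `closest_word`, the phone symbols, and the three-word lexicon are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def edit_distance(observed: Sequence[str], reference: Sequence[str]) -> int:
    """Levenshtein distance between two phone sequences.

    Assumes unit cost for each substitution, insertion, and deletion;
    a real system would likely use costs tied to phone confusability.
    """
    # prev[j] is the cost of editing the observed prefix seen so far
    # into the first j phones of the reference.
    prev = list(range(len(reference) + 1))
    for i, obs_phone in enumerate(observed, start=1):
        curr = [i]
        for j, ref_phone in enumerate(reference, start=1):
            substitution = prev[j - 1] + (0 if obs_phone == ref_phone else 1)
            deletion = prev[j] + 1      # drop a phone from the observed string
            insertion = curr[j - 1] + 1  # add a phone missing from the observed string
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def closest_word(
    observed: Sequence[str], lexicon: Dict[str, List[str]]
) -> Tuple[str, int]:
    """Return the lexicon entry whose pronunciation is nearest to the observed phones."""
    scored = ((word, edit_distance(observed, phones)) for word, phones in lexicon.items())
    return min(scored, key=lambda pair: pair[1])


# Hypothetical toy lexicon; the phone symbols are illustrative only.
TOY_LEXICON: Dict[str, List[str]] = {
    "ship": ["sh", "ih", "p"],
    "chart": ["ch", "aa", "r", "t"],
    "show": ["sh", "ow"],
}

if __name__ == "__main__":
    # A phonetic transcription with one vowel error relative to "ship".
    print(closest_word(["sh", "iy", "p"], TOY_LEXICON))
```

In this sketch the phonetic transcription is matched against each pronunciation independently; handling fluent speech would additionally require searching over word boundaries and word sequences, which is where lexical and syntactic structure enter.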
|Original language||English (US)|
|Number of pages||4|
|Journal||ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings|
|State||Published - 1990|
ASJC Scopus subject areas
- Signal Processing
- Electrical and Electronic Engineering