Abstract
In automatic speech recognition, word graphs (lattices) are commonly used as an approximate representation of the complete word search space. Usually these word lattices are acyclic and have no a priori structure. More recently, a new class of normalized word lattices has been proposed. These word lattices (a.k.a. sausages) are very compact and provide a normalization (chunking) of the lattice by aligning words from all possible hypotheses. In this paper we propose a general framework for lattice chunking, the pivot algorithm. The pivot algorithm has four important properties. First, time information is not required, although it is beneficial for overall performance. Second, the algorithm allows a chunk structure to be specified in advance for the final word lattice. Third, the algorithm operates on both weighted and unweighted lattices. Fourth, the labels on the graph are generic and could be words as well as part-of-speech tags or parse tags. While the algorithm has applications to many tasks (e.g. parsing, named entity extraction), we present results on the performance of confidence scores for different large vocabulary speech recognition tasks. We compare the results of our algorithms against off-the-shelf methods and show significant improvements.
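To make the idea of lattice chunking concrete, the sketch below collapses a weighted N-best list into per-slot word distributions, "sausage"-style, by aligning every hypothesis to the top hypothesis with standard edit-distance alignment. This is only an illustrative simplification operating on word sequences rather than full lattices; it is not the paper's pivot algorithm, and the function names and the toy N-best list are assumptions for the example.

```python
from collections import defaultdict

def align(ref, hyp):
    """Edit-distance alignment of two word sequences.
    Returns (ref_slot, hyp_word_or_None) pairs; insertions relative to
    the reference are simply dropped in this toy sketch."""
    n, m = len(ref), len(hyp)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dp[i][0] = i
    for j in range(m + 1):
        dp[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost) # match / substitution
    pairs, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + (0 if ref[i - 1] == hyp[j - 1] else 1):
            pairs.append((i - 1, hyp[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            pairs.append((i - 1, None))             # empty (epsilon) slot
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return pairs

def chunk(hypotheses, weights):
    """Build per-slot word distributions (chunks) from weighted hypotheses
    by aligning each one to the top-scoring hypothesis."""
    baseline = hypotheses[0]
    slots = [defaultdict(float) for _ in baseline]
    for hyp, w in zip(hypotheses, weights):
        for slot, word in align(baseline, hyp):
            slots[slot][word if word is not None else "<eps>"] += w
    total = sum(weights)
    return [{w: p / total for w, p in s.items()} for s in slots]

nbest = [["the", "cat", "sat"], ["the", "cat", "sat"], ["a", "cat", "sang"]]
print(chunk(nbest, [0.5, 0.3, 0.2]))
# -> one word distribution per chunk, e.g. {'the': 0.8, 'a': 0.2} for the first slot
```

The per-slot posteriors produced this way are the kind of quantity typically used as word confidence scores, which is the application evaluated in the paper.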
Original language | English (US)
---|---
Pages (from-to) | 596-599
Number of pages | 4
Journal | ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume | 1
State | Published - 2003
Externally published | Yes
Event | 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing - Hong Kong, Hong Kong. Duration: Apr 6 2003 → Apr 10 2003
ASJC Scopus subject areas
- Software
- Signal Processing
- Electrical and Electronic Engineering