Abstract
A challenge in large-vocabulary spoken language understanding (SLU) is robustness to automatic speech recognition (ASR) errors. State-of-the-art approaches to semantic parsing rely on discriminative sequence classification methods, such as conditional random fields (CRFs). Most dialog systems employ a cascaded approach in which the best hypotheses from the ASR system are fed into the subsequent SLU system. In our previous work, we proposed the use of lattices for joint recognition and parsing. In this paper, extending this idea, we propose to exploit word confusion networks (WCNs), compiled from ASR lattices, for both CRF modeling and decoding. WCNs provide a compact representation of multiple aligned ASR hypotheses without compromising recognition accuracy. For slot filling, we show significant semantic parsing performance improvements using WCNs compared to ASR 1-best output, approximating the oracle path performance.
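To make the terms 1-best, WCN, and oracle path concrete, the following is a minimal Python sketch and not the paper's implementation. It assumes a WCN can be modeled as a sequence of bins, each holding competing words with posterior probabilities. The `<eps>` symbol, the function names, and the toy utterance are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

EPS = "<eps>"  # hypothetical symbol for an empty (skip) arc
Bin = List[Tuple[str, float]]  # one WCN bin: competing words with posteriors


def one_best(wcn: List[Bin]) -> List[str]:
    """Pick the highest-posterior word in every bin, dropping empty arcs."""
    best = [max(b, key=lambda arc: arc[1])[0] for b in wcn]
    return [w for w in best if w != EPS]


def oracle_errors(wcn: List[Bin], ref: List[str]) -> int:
    """Minimum word edit distance between the reference and any WCN path."""
    n = len(ref)
    # prev[j]: fewest errors after the bins seen so far and j reference words
    prev = list(range(n + 1))  # with no bins, every reference word is missed
    for b in wcn:
        words = {w for w, _ in b}
        has_eps = EPS in words
        real = words - {EPS}
        # With no reference word consumed, the bin is skipped or inserts a word
        cur = [prev[0] + (0 if has_eps else 1)]
        for j in range(1, n + 1):
            options = [cur[j - 1] + 1]  # reference word j is missed
            if has_eps:
                options.append(prev[j])  # bin emits nothing
            if real:
                options.append(prev[j] + 1)  # bin inserts an extra word
                hit = 0 if ref[j - 1] in real else 1
                options.append(prev[j - 1] + hit)  # match or substitution
            cur.append(min(options))
        prev = cur
    return prev[n]


# Toy data, invented for illustration only
wcn = [
    [("flights", 0.6), ("flight", 0.4)],
    [("to", 0.9), (EPS, 0.1)],
    [("austin", 0.55), ("boston", 0.45)],
]
ref = ["flights", "to", "boston"]

hyp = one_best(wcn)
print(hyp)                                          # ['flights', 'to', 'austin']
print(oracle_errors([[(w, 1.0)] for w in hyp], ref))  # 1 error on the 1-best
print(oracle_errors(wcn, ref))                      # 0 errors on the oracle path
```

In this toy case the correct slot value survives in the WCN even though the 1-best hypothesis loses it, which is the kind of information a model that reads the WCN directly could in principle recover.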
Original language | English (US) |
---|---|
Pages (from-to) | 2579-2583 |
Number of pages | 5 |
Journal | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH |
State | Published - 2013 |
Externally published | Yes |
Event | 14th Annual Conference of the International Speech Communication Association, INTERSPEECH 2013 - Lyon, France; Duration: Aug 25 2013 → Aug 29 2013 |
Keywords
- Conditional random field
- Natural language understanding
- Semantic parsing
- Word confusion network
ASJC Scopus subject areas
- Language and Linguistics
- Human-Computer Interaction
- Signal Processing
- Software
- Modeling and Simulation