Abstract
This paper presents a semi-latent topic model for semantic domain detection in spoken language understanding systems. We use labeled utterance information to capture latent topics, which directly correspond to semantic domains. Additionally, we introduce an 'informative prior' for Bayesian inference that can simultaneously segment utterances of known domains into classes and separate them from out-of-domain utterances. We show that our model generalizes well on the task of classifying spoken language utterances and compare its results to those of an unsupervised topic model, which does not use labeled information.
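To make the semi-latent idea concrete, the sketch below shows a generic labeled-LDA-style collapsed Gibbs sampler in which utterances with a known domain label are pinned to the topic matching that label, while unlabeled utterances may be assigned to any topic. This is not the authors' implementation: the symmetric `alpha`/`beta` hyperparameters, the corpus format, and the way labels constrain topics are illustrative assumptions standing in for the paper's informative prior.

```python
# A minimal sketch (assumed, not the paper's model) of a semi-supervised,
# labeled-LDA-style collapsed Gibbs sampler for utterance/domain modeling.
import numpy as np

def gibbs_semi_latent(docs, labels, n_topics, vocab_size,
                      alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """docs: list of lists of word ids; labels[d] is a topic id or None."""
    rng = np.random.default_rng(seed)
    n_dk = np.zeros((len(docs), n_topics))   # document-topic counts
    n_kw = np.zeros((n_topics, vocab_size))  # topic-word counts
    n_k = np.zeros(n_topics)                 # topic totals
    z = []                                   # topic assignment per token

    # Initialise: labeled tokens take their domain topic, others are random.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = labels[d] if labels[d] is not None else rng.integers(n_topics)
            z_d.append(k)
            n_dk[d, k] += 1; n_kw[k, w] += 1; n_k[k] += 1
        z.append(z_d)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            if labels[d] is not None:
                continue  # labeled utterances stay pinned to their domain topic
            for i, w in enumerate(doc):
                k = z[d][i]
                # Remove the current assignment from the counts.
                n_dk[d, k] -= 1; n_kw[k, w] -= 1; n_k[k] -= 1
                # Collapsed conditional p(z = k | rest) up to a constant.
                p = (n_dk[d] + alpha) * (n_kw[:, w] + beta) / (n_k + beta * vocab_size)
                k = rng.choice(n_topics, p=p / p.sum())
                z[d][i] = k
                n_dk[d, k] += 1; n_kw[k, w] += 1; n_k[k] += 1

    # Posterior-mean estimates of topic-word and document-topic distributions.
    phi = (n_kw + beta) / (n_kw.sum(axis=1, keepdims=True) + beta * vocab_size)
    theta = (n_dk + alpha) / (n_dk.sum(axis=1, keepdims=True) + alpha * n_topics)
    return phi, theta
```

In this sketch, reserving one or more extra topics beyond the labeled domains would let unlabeled or out-of-domain utterances gravitate toward them, which loosely mirrors the paper's goal of jointly classifying known domains and detecting out-of-domain input.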
| Original language | English (US) |
| --- | --- |
| Pages (from-to) | 713-716 |
| Number of pages | 4 |
| Journal | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH |
| State | Published - 2011 |
| Externally published | Yes |
| Event | 12th Annual Conference of the International Speech Communication Association, INTERSPEECH 2011, Florence, Italy, Aug 27 2011 - Aug 31 2011 |
Keywords
- Generative models
- Gibbs sampling
- Spoken language understanding
ASJC Scopus subject areas
- Language and Linguistics
- Human-Computer Interaction
- Signal Processing
- Software
- Modeling and Simulation