A discriminative model for query spelling correction with latent structural SVM

Huizhong Duan, Yanen Li, Cheng Xiang Zhai, Dan Roth

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Discriminative training in query spelling correction is difficult due to the complex internal structures of the data. Recent work on query spelling correction suggests a two-stage approach: a noisy channel model is used to retrieve a number of candidate corrections, followed by a discriminatively trained ranker applied to these candidates. The ranker, however, suffers from the low recall of the first, suboptimal, search stage. This paper proposes to directly optimize the search stage with a discriminative model based on latent structural SVM. In this model, we treat query spelling correction as a multi-class classification problem with structured input and output. The latent structural information is used to model the alignment of words in the spelling correction process. Experimental results show that, as a standalone speller, our model outperforms all the baseline systems. It also attains a higher recall compared with the noisy channel model, and can therefore serve as a better filtering stage when combined with a ranker.
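
For readers unfamiliar with the term, the generic latent structural SVM formulation from the literature can be sketched as follows. This is the standard textbook form, not necessarily the exact objective used in the paper. All symbols are assumptions introduced here for illustration: $x$ is the input query, $y$ a candidate correction, $h$ a latent word alignment between them, $\Phi$ a joint feature map, $\Delta$ a loss function, and $C$ a regularization constant.

```latex
% Prediction: jointly search over corrections y and latent alignments h
(\hat{y}, \hat{h}) = \arg\max_{(y,h)} \; \mathbf{w}^\top \Phi(x, y, h)

% Training: regularized margin objective over n labeled queries (x_i, y_i)
\min_{\mathbf{w}} \;\; \frac{1}{2}\lVert \mathbf{w} \rVert^2
  + C \sum_{i=1}^{n} \Big[
      \max_{(\hat{y},\hat{h})} \big( \mathbf{w}^\top \Phi(x_i, \hat{y}, \hat{h})
        + \Delta(y_i, \hat{y}, \hat{h}) \big)
      - \max_{h} \, \mathbf{w}^\top \Phi(x_i, y_i, h)
    \Big]
```

In this form the alignment $h$ is never observed in the training data; the second maximization picks the best-scoring alignment for the gold correction $y_i$, which is what lets word alignment be treated as latent structure.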

Original language: English (US)
Title of host publication: EMNLP-CoNLL 2012 - 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, Proceedings of the Conference
Pages: 1511-1521
Number of pages: 11
State: Published - Dec 1 2012
Event: 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, EMNLP-CoNLL 2012 - Jeju Island, Korea, Republic of
Duration: Jul 12 2012 to Jul 14 2012

Publication series

Name: EMNLP-CoNLL 2012 - 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, Proceedings of the Conference

Other

Other: 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, EMNLP-CoNLL 2012
Country: Korea, Republic of
City: Jeju Island
Period: 7/12/12 to 7/14/12

ASJC Scopus subject areas

  • Software
