CloudSpeller: Query spelling correction by using a unified Hidden Markov Model with Web-scale resources

Yanen Li, Huizhong Duan, Cheng Xiang Zhai

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Query spelling correction is an important component of modern search engines that can help users express an information need more accurately and thus improve search quality. In this work we propose and implement an end-to-end spelling correction system, namely CloudSpeller. The CloudSpeller system uses a Hidden Markov Model to effectively model major types of spelling errors in a unified framework, in which we integrate a large-scale lexicon constructed using Wikipedia, an error model trained from high-confidence correction pairs, and the Microsoft Web N-gram service. Our system achieves excellent performance on two search query spelling correction datasets, reaching F1 scores of 0.960 and 0.937 on the TREC dataset and the MSN dataset, respectively. Copyright is held by the author/owner(s).
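
The abstract names the ingredients (a lexicon, an error model, and an n-gram language model combined in a Hidden Markov Model) but not the implementation. As a rough illustration only, the sketch below treats each intended word as a hidden state, each typed token as an observation, and decodes the most likely correction with Viterbi. Everything in it is an assumption made for the example: the tiny LEXICON, the BIGRAM_LOGP table, the edit-distance emission score, and the function names are hypothetical stand-ins for the Wikipedia lexicon, the trained error model, and the Web N-gram service. It only handles per-token corrections and is not the authors' system.

```python
# Toy stand-ins (all hypothetical, for illustration only).
LEXICON = ["new", "york", "times", "time", "news", "yolk"]
BIGRAM_LOGP = {
    ("<s>", "new"): -1.0,
    ("new", "york"): -0.5,
    ("york", "times"): -0.7,
}
DEFAULT_LOGP = -6.0  # back-off score for bigrams missing from the toy table


def edit_distance(a, b):
    """Levenshtein distance via the standard dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def emission_logp(observed, intended, penalty=2.0):
    """Toy error model: each edit costs a fixed log-probability penalty."""
    return -penalty * edit_distance(observed, intended)


def candidates(token, max_dist=2):
    """Hidden states for one token: lexicon words within max_dist edits."""
    found = [w for w in LEXICON if edit_distance(token, w) <= max_dist]
    return found or [token]  # keep out-of-lexicon tokens unchanged


def viterbi_correct(query):
    """Return the highest-scoring sequence of intended words for the query."""
    # best[w] = (score of the best path ending in state w, that path)
    best = {"<s>": (0.0, [])}
    for tok in query.split():
        nxt = {}
        for cand in candidates(tok):
            emit = emission_logp(tok, cand)
            # Transition score comes from the toy bigram table.
            score, path = max(
                (s + BIGRAM_LOGP.get((prev, cand), DEFAULT_LOGP) + emit, p)
                for prev, (s, p) in best.items()
            )
            nxt[cand] = (score, path + [cand])
        best = nxt
    return " ".join(max(best.values())[1])


print(viterbi_correct("new yrok tmes"))
```

With these toy scores, "yrok" is equally close to "york" and "yolk" under the emission model, and the bigram context after "new" is what breaks the tie; that interplay between error model and language model is the point of the sketch.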

Original language: English (US)
Title of host publication: WWW'12 - Proceedings of the 21st Annual Conference on World Wide Web Companion
Pages: 567-568
Number of pages: 2
State: Published - 2012
Event: 21st Annual Conference on World Wide Web, WWW'12 - Lyon, France
Duration: Apr 16 2012 to Apr 20 2012

Publication series

Name: WWW'12 - Proceedings of the 21st Annual Conference on World Wide Web Companion

Other

Other: 21st Annual Conference on World Wide Web, WWW'12
Country/Territory: France
City: Lyon
Period: 4/16/12 to 4/20/12

Keywords

  • CloudSpeller
  • Query spelling correction

ASJC Scopus subject areas

  • Computer Networks and Communications
