Fast string dictionary lookup with one error

Timothy Chan, Moshe Lewenstein

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

A set of strings, called a string dictionary, is a basic string data structure. The most primitive query, where one seeks the existence of a pattern in the dictionary, is called a lookup query. Approximate lookup queries, i.e., to lookup the existence of a pattern with a bounded number of errors, is a fundamental string problem. Several data structures have been proposed to do so efficiently. Almost all solutions consider a single error, as will this result. Lately, Belazzougui and Venturini (CPM 2013) raised the question whether one can construct efficient indexes that support lookup queries with one error in optimal query time, that is, O(|p|/ω + occ), where p is the query, ω the machine word-size, and occ the number of occurrences. Specifically, for the problem of one mismatch and constant alphabet size, we obtain optimal query time. For a dictionary of d strings our proposed index uses O(ωd log1+ε d) additional bit space (beyond the space required to access the dictionary data, which can be maintained in compressed form). Our results are parameterized for a space-time tradeoff. We propose more results for the case of lookup queries with one insertion/ deletion on dictionaries over a constant sized alphabet. These results are especially effective for large patterns.

Original languageEnglish (US)
Title of host publicationCombinatorial Pattern Matching - 26th Annual Symposium, CPM 2015, Proceedings
EditorsUgo Vaccaro, Ely Porat, Ferdinando Cicalese
PublisherSpringer-Verlag Berlin Heidelberg
Pages114-123
Number of pages10
ISBN (Print)9783319199283
DOIs
StatePublished - 2015
Externally publishedYes
Event26th Annual Symposium on Combinatorial Pattern Matching, CPM 2015 - Ischia Island, Italy
Duration: Jun 29 2015Jul 1 2015

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume9133
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Other

Other26th Annual Symposium on Combinatorial Pattern Matching, CPM 2015
CountryItaly
CityIschia Island
Period6/29/157/1/15

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science(all)

Fingerprint Dive into the research topics of 'Fast string dictionary lookup with one error'. Together they form a unique fingerprint.

Cite this