Trustworthy keyword search for compliance storage

Soumyadeb Mitra, Marianne Winslett, Windsor W. Hsu, Kevin Chen Chuan Chang

Research output: Contribution to journal › Article › peer-review


Intense regulatory focus on secure retention of electronic records has led to a need to ensure that records are trustworthy, i.e., able to provide irrefutable proof and accurate details of past events. In this paper, we analyze the requirements for a trustworthy index to support keyword-based search queries. We argue that trustworthy index entries must be durable: the index must be updated when new documents arrive, and not periodically deleted and rebuilt. To this end, we propose a scheme for efficiently updating an inverted index, based on judicious merging of the posting lists of terms. Through extensive simulations and experiments with two real-world data sets and workloads, we demonstrate that the scheme achieves online update speed while maintaining good query performance. We also present and evaluate jump indexes, a novel trustworthy and efficient index for join operations on posting lists for multi-keyword queries. Jump indexes support insert, lookup, and range queries in time logarithmic in the number of indexed documents.
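
The idea of an inverted index whose terms share merged posting lists, and which is updated as each document arrives, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class name `MergedPostingIndex`, the `num_lists` parameter, and in particular the hash-based assignment of terms to lists, which stands in for whatever merging strategy the paper actually uses. It does not implement jump indexes; multi-keyword queries are answered here by a plain set intersection.

```python
import zlib


class MergedPostingIndex:
    """Hypothetical append-only inverted index with a fixed number of merged posting lists."""

    def __init__(self, num_lists=4):
        self.num_lists = num_lists
        # Each merged list holds (term, doc_id) pairs in arrival order.
        self.lists = [[] for _ in range(num_lists)]

    def _slot(self, term):
        # Assumption: a deterministic hash picks the merged list for a term.
        return zlib.crc32(term.encode("utf-8")) % self.num_lists

    def add_document(self, doc_id, text):
        # Index the document as soon as it arrives; entries are only appended,
        # never rewritten or rebuilt.
        for term in sorted(set(text.lower().split())):
            self.lists[self._slot(term)].append((term, doc_id))

    def lookup(self, term):
        # Scan the merged list and keep only the postings for this term.
        term = term.lower()
        return [doc for t, doc in self.lists[self._slot(term)] if t == term]

    def search_all(self, *terms):
        # Conjunctive multi-keyword query via set intersection.
        result = None
        for term in terms:
            ids = set(self.lookup(term))
            result = ids if result is None else result & ids
        return sorted(result) if result else []
```

With fewer lists than terms, each update touches a small, bounded set of list tails, at the cost of filtering out unrelated postings at query time. That trade-off between update speed and query performance is the one the abstract refers to, although this sketch makes no attempt to reproduce the paper's measurements.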

Original language: English (US)
Pages (from-to): 225-242
Number of pages: 18
Journal: VLDB Journal
Issue number: 2
State: Published - Mar 2008


  • Compliance storage
  • Inverted index
  • Jump index

ASJC Scopus subject areas

  • Information Systems
  • Hardware and Architecture

