Fast Compaction Algorithms for NoSQL Databases

Mainak Ghosh, Indranil Gupta, Shalmoli Gupta, Nirman Kumar

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Compaction plays a crucial role in NoSQL systems to ensure a high overall read throughput. In this work, we formally define compaction as an optimization problem that attempts to minimize disk I/O. We prove this problem to be NPHard. We then propose a set of algorithms and mathematically analyze upper bounds on worst-case cost. We evaluate the proposed algorithms on real-life workloads. Our results show that our algorithms incur low I/O costs and that a compaction approach using a balanced tree is most preferable.

Original languageEnglish (US)
Title of host publicationProceedings - 2015 IEEE 35th International Conference on Distributed Computing Systems, ICDCS 2015
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages452-461
Number of pages10
ISBN (Electronic)9781467372145
DOIs
StatePublished - Jul 22 2015
Event35th IEEE International Conference on Distributed Computing Systems, ICDCS 2015 - Columbus, United States
Duration: Jun 29 2015Jul 2 2015

Publication series

NameProceedings - International Conference on Distributed Computing Systems
Volume2015-July

Other

Other35th IEEE International Conference on Distributed Computing Systems, ICDCS 2015
Country/TerritoryUnited States
CityColumbus
Period6/29/157/2/15

Keywords

  • compaction
  • greedy approximation algorithm
  • nosql
  • np-hard

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
  • Computer Networks and Communications

Fingerprint

Dive into the research topics of 'Fast Compaction Algorithms for NoSQL Databases'. Together they form a unique fingerprint.

Cite this