Fast Compaction Algorithms for NoSQL Databases

Mainak Ghosh, Indranil Gupta, Shalmoli Gupta, Nirman Kumar

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Compaction plays a crucial role in NoSQL systems to ensure a high overall read throughput. In this work, we formally define compaction as an optimization problem that attempts to minimize disk I/O. We prove this problem to be NPHard. We then propose a set of algorithms and mathematically analyze upper bounds on worst-case cost. We evaluate the proposed algorithms on real-life workloads. Our results show that our algorithms incur low I/O costs and that a compaction approach using a balanced tree is most preferable.

Original languageEnglish (US)
Title of host publicationProceedings - 2015 IEEE 35th International Conference on Distributed Computing Systems, ICDCS 2015
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages452-461
Number of pages10
ISBN (Electronic)9781467372145
DOIs
StatePublished - Jul 22 2015
Event35th IEEE International Conference on Distributed Computing Systems, ICDCS 2015 - Columbus, United States
Duration: Jun 29 2015Jul 2 2015

Publication series

NameProceedings - International Conference on Distributed Computing Systems
Volume2015-July

Other

Other35th IEEE International Conference on Distributed Computing Systems, ICDCS 2015
CountryUnited States
CityColumbus
Period6/29/157/2/15

Keywords

  • compaction
  • greedy approximation algorithm
  • nosql
  • np-hard

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
  • Computer Networks and Communications

Fingerprint Dive into the research topics of 'Fast Compaction Algorithms for NoSQL Databases'. Together they form a unique fingerprint.

  • Cite this

    Ghosh, M., Gupta, I., Gupta, S., & Kumar, N. (2015). Fast Compaction Algorithms for NoSQL Databases. In Proceedings - 2015 IEEE 35th International Conference on Distributed Computing Systems, ICDCS 2015 (pp. 452-461). [7164931] (Proceedings - International Conference on Distributed Computing Systems; Vol. 2015-July). Institute of Electrical and Electronics Engineers Inc.. https://doi.org/10.1109/ICDCS.2015.53