Leveraging Metadata in No SQL Storage Systems

Ala Alkhaldi, Indranil Gupta, Vaijayanth Raghavan, Mainak Ghosh

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

NoSQL systems have grown in popularity for storing big data because these systems offer high availability, i.e., operations with high throughput and low latency. However, metadata in these systems are handled today in ad-hoc ways. We present Wasef, a system that treats metadata in a NoSQL database system as first-class citizens. Metadata may include information such as: operational history for a database table (e.g., columns), placement information for ranges of keys, and operational logs for data items (key-value pairs). Wasef allows the NoSQL system to store and query this metadata efficiently. We integrate Wasef into Apache Cassandra, one of the most popular key-value stores. We then implement three important use cases in Cassandra: dropping columns in a flexible manner, verifying data durability during migrational operations such as node decommissioning, and maintaining data provenance. Our experimental evaluation uses AWS EC2 instances and YCSB workloads. Our results show that Wasef: i) scales well with the size of the data and the metadata, and ii) minimally affects throughput and operation latencies.

Original language: English (US)
Title of host publication: Proceedings - 2015 IEEE 8th International Conference on Cloud Computing, CLOUD 2015
Editors: Calton Pu, Ajay Mohindra
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 57-64
Number of pages: 8
ISBN (Electronic): 9781467372879
DOIs: https://doi.org/10.1109/CLOUD.2015.18
State: Published - Aug 19 2015
Event: 8th IEEE International Conference on Cloud Computing, CLOUD 2015 - New York, United States
Duration: Jun 27 2015 to Jul 2 2015

Publication series

Name: Proceedings - 2015 IEEE 8th International Conference on Cloud Computing, CLOUD 2015

Other

Other: 8th IEEE International Conference on Cloud Computing, CLOUD 2015
Country: United States
City: New York
Period: 6/27/15 to 7/2/15

Fingerprint

Metadata
Throughput
Durability
Availability

Keywords

  • Metadata
  • NoSQL
  • Provenance

ASJC Scopus subject areas

  • Computer Networks and Communications

Cite this

Alkhaldi, A., Gupta, I., Raghavan, V., & Ghosh, M. (2015). Leveraging Metadata in No SQL Storage Systems. In C. Pu, & A. Mohindra (Eds.), Proceedings - 2015 IEEE 8th International Conference on Cloud Computing, CLOUD 2015 (pp. 57-64). [7214028] (Proceedings - 2015 IEEE 8th International Conference on Cloud Computing, CLOUD 2015). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/CLOUD.2015.18

Leveraging Metadata in No SQL Storage Systems. / Alkhaldi, Ala; Gupta, Indranil; Raghavan, Vaijayanth; Ghosh, Mainak.

Proceedings - 2015 IEEE 8th International Conference on Cloud Computing, CLOUD 2015. ed. / Calton Pu; Ajay Mohindra. Institute of Electrical and Electronics Engineers Inc., 2015. p. 57-64 7214028 (Proceedings - 2015 IEEE 8th International Conference on Cloud Computing, CLOUD 2015).

Alkhaldi, A, Gupta, I, Raghavan, V & Ghosh, M 2015, Leveraging Metadata in No SQL Storage Systems. in C Pu & A Mohindra (eds), Proceedings - 2015 IEEE 8th International Conference on Cloud Computing, CLOUD 2015, 7214028, Proceedings - 2015 IEEE 8th International Conference on Cloud Computing, CLOUD 2015, Institute of Electrical and Electronics Engineers Inc., pp. 57-64, 8th IEEE International Conference on Cloud Computing, CLOUD 2015, New York, United States, 6/27/15. https://doi.org/10.1109/CLOUD.2015.18
Alkhaldi A, Gupta I, Raghavan V, Ghosh M. Leveraging Metadata in No SQL Storage Systems. In Pu C, Mohindra A, editors, Proceedings - 2015 IEEE 8th International Conference on Cloud Computing, CLOUD 2015. Institute of Electrical and Electronics Engineers Inc. 2015. p. 57-64. 7214028. (Proceedings - 2015 IEEE 8th International Conference on Cloud Computing, CLOUD 2015). https://doi.org/10.1109/CLOUD.2015.18
Alkhaldi, Ala ; Gupta, Indranil ; Raghavan, Vaijayanth ; Ghosh, Mainak. / Leveraging Metadata in No SQL Storage Systems. Proceedings - 2015 IEEE 8th International Conference on Cloud Computing, CLOUD 2015. editor / Calton Pu ; Ajay Mohindra. Institute of Electrical and Electronics Engineers Inc., 2015. pp. 57-64 (Proceedings - 2015 IEEE 8th International Conference on Cloud Computing, CLOUD 2015).
@inproceedings{b66b3badf7d349e697418f4e6ccfcfe7,
title = "Leveraging Metadata in No SQL Storage Systems",
abstract = "NoSQL systems have grown in popularity for storing big data because these systems offer high availability, i.e., operations with high throughput and low latency. However, metadata in these systems are handled today in ad-hoc ways. We present Wasef, a system that treats metadata in a NoSQL database system as first-class citizens. Metadata may include information such as: operational history for a database table (e.g., columns), placement information for ranges of keys, and operational logs for data items (key-value pairs). Wasef allows the NoSQL system to store and query this metadata efficiently. We integrate Wasef into Apache Cassandra, one of the most popular key-value stores. We then implement three important use cases in Cassandra: dropping columns in a flexible manner, verifying data durability during migrational operations such as node decommissioning, and maintaining data provenance. Our experimental evaluation uses AWS EC2 instances and YCSB workloads. Our results show that Wasef: i) scales well with the size of the data and the metadata, and ii) minimally affects throughput and operation latencies.",
keywords = "Metadata, NoSQL, Provenance",
author = "Ala Alkhaldi and Indranil Gupta and Vaijayanth Raghavan and Mainak Ghosh",
year = "2015",
month = "8",
day = "19",
doi = "10.1109/CLOUD.2015.18",
language = "English (US)",
series = "Proceedings - 2015 IEEE 8th International Conference on Cloud Computing, CLOUD 2015",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "57--64",
editor = "Calton Pu and Ajay Mohindra",
booktitle = "Proceedings - 2015 IEEE 8th International Conference on Cloud Computing, CLOUD 2015",
address = "United States",
}

TY - GEN

T1 - Leveraging Metadata in No SQL Storage Systems

AU - Alkhaldi, Ala

AU - Gupta, Indranil

AU - Raghavan, Vaijayanth

AU - Ghosh, Mainak

PY - 2015/8/19

Y1 - 2015/8/19

N2 - NoSQL systems have grown in popularity for storing big data because these systems offer high availability, i.e., operations with high throughput and low latency. However, metadata in these systems are handled today in ad-hoc ways. We present Wasef, a system that treats metadata in a NoSQL database system as first-class citizens. Metadata may include information such as: operational history for a database table (e.g., columns), placement information for ranges of keys, and operational logs for data items (key-value pairs). Wasef allows the NoSQL system to store and query this metadata efficiently. We integrate Wasef into Apache Cassandra, one of the most popular key-value stores. We then implement three important use cases in Cassandra: dropping columns in a flexible manner, verifying data durability during migrational operations such as node decommissioning, and maintaining data provenance. Our experimental evaluation uses AWS EC2 instances and YCSB workloads. Our results show that Wasef: i) scales well with the size of the data and the metadata, and ii) minimally affects throughput and operation latencies.

AB - NoSQL systems have grown in popularity for storing big data because these systems offer high availability, i.e., operations with high throughput and low latency. However, metadata in these systems are handled today in ad-hoc ways. We present Wasef, a system that treats metadata in a NoSQL database system as first-class citizens. Metadata may include information such as: operational history for a database table (e.g., columns), placement information for ranges of keys, and operational logs for data items (key-value pairs). Wasef allows the NoSQL system to store and query this metadata efficiently. We integrate Wasef into Apache Cassandra, one of the most popular key-value stores. We then implement three important use cases in Cassandra: dropping columns in a flexible manner, verifying data durability during migrational operations such as node decommissioning, and maintaining data provenance. Our experimental evaluation uses AWS EC2 instances and YCSB workloads. Our results show that Wasef: i) scales well with the size of the data and the metadata, and ii) minimally affects throughput and operation latencies.

KW - Metadata

KW - NoSQL

KW - Provenance

UR - http://www.scopus.com/inward/record.url?scp=84960128251&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84960128251&partnerID=8YFLogxK

U2 - 10.1109/CLOUD.2015.18

DO - 10.1109/CLOUD.2015.18

M3 - Conference contribution

AN - SCOPUS:84960128251

T3 - Proceedings - 2015 IEEE 8th International Conference on Cloud Computing, CLOUD 2015

SP - 57

EP - 64

BT - Proceedings - 2015 IEEE 8th International Conference on Cloud Computing, CLOUD 2015

A2 - Pu, Calton

A2 - Mohindra, Ajay

PB - Institute of Electrical and Electronics Engineers Inc.

ER -