TY - GEN
T1 - Leveraging Metadata in NoSQL Storage Systems
AU - Alkhaldi, Ala
AU - Gupta, Indranil
AU - Raghavan, Vaijayanth
AU - Ghosh, Mainak
N1 - Publisher Copyright: © 2015 IEEE.
PY - 2015/8/19
Y1 - 2015/8/19
N2 - NoSQL systems have grown in popularity for storing big data because these systems offer high availability, i.e., operations with high throughput and low latency. However, metadata in these systems are handled today in ad-hoc ways. We present Wasef, a system that treats metadata in a NoSQL database system as first-class citizens. Metadata may include information such as: operational history for a database table (e.g., columns), placement information for ranges of keys, and operational logs for data items (key-value pairs). Wasef allows the NoSQL system to store and query this metadata efficiently. We integrate Wasef into Apache Cassandra, one of the most popular key-value stores. We then implement three important use cases in Cassandra: dropping columns in a flexible manner, verifying data durability during migrational operations such as node decommissioning, and maintaining data provenance. Our experimental evaluation uses AWS EC2 instances and YCSB workloads. Our results show that Wasef: i) scales well with the size of the data and the metadata, and ii) minimally affects throughput and operation latencies.
AB - NoSQL systems have grown in popularity for storing big data because these systems offer high availability, i.e., operations with high throughput and low latency. However, metadata in these systems are handled today in ad-hoc ways. We present Wasef, a system that treats metadata in a NoSQL database system as first-class citizens. Metadata may include information such as: operational history for a database table (e.g., columns), placement information for ranges of keys, and operational logs for data items (key-value pairs). Wasef allows the NoSQL system to store and query this metadata efficiently. We integrate Wasef into Apache Cassandra, one of the most popular key-value stores. We then implement three important use cases in Cassandra: dropping columns in a flexible manner, verifying data durability during migrational operations such as node decommissioning, and maintaining data provenance. Our experimental evaluation uses AWS EC2 instances and YCSB workloads. Our results show that Wasef: i) scales well with the size of the data and the metadata, and ii) minimally affects throughput and operation latencies.
KW - Metadata
KW - NoSQL
KW - Provenance
UR - http://www.scopus.com/inward/record.url?scp=84960128251&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84960128251&partnerID=8YFLogxK
U2 - 10.1109/CLOUD.2015.18
DO - 10.1109/CLOUD.2015.18
M3 - Conference contribution
AN - SCOPUS:84960128251
T3 - Proceedings - 2015 IEEE 8th International Conference on Cloud Computing, CLOUD 2015
SP - 57
EP - 64
BT - Proceedings - 2015 IEEE 8th International Conference on Cloud Computing, CLOUD 2015
A2 - Pu, Calton
A2 - Mohindra, Ajay
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 8th IEEE International Conference on Cloud Computing, CLOUD 2015
Y2 - 27 June 2015 through 2 July 2015
ER -