Storage-efficient data replica number computation for multi-level priority data in distributed storage systems

Chris X. Cai, Cristina L. Abad, Roy H. Campbell

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Distributed storage systems often use replication for improved availability, performance and scalability. In this paper, we consider the case of using file replication to improve the availability of different classes of files, where some classes are more 'important' than others and more replicas are created for them to achieve improved availability. The question we attempt to answer is: given a fixed storage budget for storing replicas, what is the number of replicas of each file class to create to maximize the (weighted) overall availability of files? We present our work towards a replica number computation algorithm that takes into account a storage budget, a configurable maximum expected percentage of failed nodes, and weights for different file classes. Simulation results show that our algorithm is able to improve the availability of the prioritized files with higher weights, has a low computation time and can utilize storage space efficiently when total storage space scales to a large size.
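The optimization question posed in the abstract can be illustrated with a small sketch. This is not the authors' algorithm, which the abstract does not specify. It is a simple greedy heuristic under assumptions introduced here for illustration only: each node fails independently with probability `p_fail` (standing in for the configurable maximum expected percentage of failed nodes), a file with `r` replicas on distinct nodes is available with probability `1 - p_fail**r`, and every file class has a known per-replica storage cost. The names `FileClass`, `availability`, and `allocate_replicas` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FileClass:
    name: str
    weight: float  # relative importance of the class (assumed input)
    size: float    # storage used by ONE replica of the whole class (assumed input)


def availability(replicas: int, p_fail: float) -> float:
    """Probability that at least one replica survives, assuming
    independent node failures with probability p_fail."""
    return 1.0 - p_fail ** replicas


def allocate_replicas(classes, budget, p_fail, min_replicas=1):
    """Greedy heuristic: repeatedly give one more replica to the class
    with the largest weighted availability gain per unit of storage,
    until no further replica fits in the budget."""
    counts = {c.name: min_replicas for c in classes}
    used = sum(c.size * min_replicas for c in classes)
    if used > budget:
        raise ValueError("budget too small for the minimum replica count")

    while True:
        best, best_gain = None, 0.0
        for c in classes:
            if used + c.size > budget:
                continue  # one more replica of this class would not fit
            r = counts[c.name]
            gain = c.weight * (
                availability(r + 1, p_fail) - availability(r, p_fail)
            ) / c.size
            if gain > best_gain:
                best, best_gain = c, gain
        if best is None:
            return counts  # nothing fits, or no replica adds any gain
        counts[best.name] += 1
        used += best.size
```

For example, with two classes of equal size 1, weights 10 ("hot") and 1 ("cold"), a budget of 6, and `p_fail = 0.2`, this sketch assigns 4 replicas to the hot class and 2 to the cold class. Because each extra replica yields a smaller gain than the previous one, the greedy choice tends to favor high-weight classes first and then spread the remaining budget, but it is a heuristic and is not guaranteed to be optimal when class sizes differ.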

Original language: English (US)
Title of host publication: 2013 43rd Annual IEEE/IFIP Conference on Dependable Systems and Networks Workshop, DSN-W 2013
Publisher: IEEE Computer Society
ISBN (Print): 9781479901814
DOI: https://doi.org/10.1109/DSNW.2013.6615512
State: Published - Jan 1 2013
Event: 2013 43rd Annual IEEE/IFIP Conference on Dependable Systems and Networks Workshop, DSN-W 2013 - Budapest, Hungary
Duration: Jun 24 2013 to Jun 27 2013

Publication series

Name: Proceedings of the International Conference on Dependable Systems and Networks

Other

Other: 2013 43rd Annual IEEE/IFIP Conference on Dependable Systems and Networks Workshop, DSN-W 2013
Country: Hungary
City: Budapest
Period: 6/24/13 to 6/27/13

Keywords

  • availability
  • budget
  • optimization
  • replication
  • storage system

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
  • Computer Networks and Communications

Cite this

Cai, C. X., Abad, C. L., & Campbell, R. H. (2013). Storage-efficient data replica number computation for multi-level priority data in distributed storage systems. In 2013 43rd Annual IEEE/IFIP Conference on Dependable Systems and Networks Workshop, DSN-W 2013 [6615512] (Proceedings of the International Conference on Dependable Systems and Networks). IEEE Computer Society. https://doi.org/10.1109/DSNW.2013.6615512

Storage-efficient data replica number computation for multi-level priority data in distributed storage systems. / Cai, Chris X.; Abad, Cristina L.; Campbell, Roy H.

2013 43rd Annual IEEE/IFIP Conference on Dependable Systems and Networks Workshop, DSN-W 2013. IEEE Computer Society, 2013. 6615512 (Proceedings of the International Conference on Dependable Systems and Networks).

Cai, CX, Abad, CL & Campbell, RH 2013, Storage-efficient data replica number computation for multi-level priority data in distributed storage systems. in 2013 43rd Annual IEEE/IFIP Conference on Dependable Systems and Networks Workshop, DSN-W 2013, 6615512, Proceedings of the International Conference on Dependable Systems and Networks, IEEE Computer Society, 2013 43rd Annual IEEE/IFIP Conference on Dependable Systems and Networks Workshop, DSN-W 2013, Budapest, Hungary, 6/24/13. https://doi.org/10.1109/DSNW.2013.6615512

Cai CX, Abad CL, Campbell RH. Storage-efficient data replica number computation for multi-level priority data in distributed storage systems. In 2013 43rd Annual IEEE/IFIP Conference on Dependable Systems and Networks Workshop, DSN-W 2013. IEEE Computer Society. 2013. 6615512. (Proceedings of the International Conference on Dependable Systems and Networks). https://doi.org/10.1109/DSNW.2013.6615512

Cai, Chris X. ; Abad, Cristina L. ; Campbell, Roy H. / Storage-efficient data replica number computation for multi-level priority data in distributed storage systems. 2013 43rd Annual IEEE/IFIP Conference on Dependable Systems and Networks Workshop, DSN-W 2013. IEEE Computer Society, 2013. (Proceedings of the International Conference on Dependable Systems and Networks).

@inproceedings{abb2fcf0e0dd46cf87563d1317e4b01b,
title = "Storage-efficient data replica number computation for multi-level priority data in distributed storage systems",
abstract = "Distributed storage systems often use replication for improved availability, performance and scalability. In this paper, we consider the case of using file replication to improve the availability of different classes of files, where some classes are more 'important' than others and more replicas are created for them to achieve improved availability. The question we attempt to answer is: given a fixed storage budget for storing replicas, what is the number of replicas of each file class to create to maximize the (weighted) overall availability of files? We present our work towards a replica number computation algorithm that takes into account a storage budget, a configurable maximum expected percentage of failed nodes, and weights for different file classes. Simulation results show that our algorithm is able to improve the availability of the prioritized files with higher weights, has a low computation time and can utilize storage space efficiently when total storage space scales to a large size.",
keywords = "availability, budget, optimization, replication, storage system",
author = "Cai, {Chris X.} and Abad, {Cristina L.} and Campbell, {Roy H.}",
year = "2013",
month = "1",
day = "1",
doi = "10.1109/DSNW.2013.6615512",
language = "English (US)",
isbn = "9781479901814",
series = "Proceedings of the International Conference on Dependable Systems and Networks",
publisher = "IEEE Computer Society",
booktitle = "2013 43rd Annual IEEE/IFIP Conference on Dependable Systems and Networks Workshop, DSN-W 2013",
}

TY  - GEN
T1  - Storage-efficient data replica number computation for multi-level priority data in distributed storage systems
AU  - Cai, Chris X.
AU  - Abad, Cristina L.
AU  - Campbell, Roy H.
PY  - 2013/1/1
Y1  - 2013/1/1
N2  - Distributed storage systems often use replication for improved availability, performance and scalability. In this paper, we consider the case of using file replication to improve the availability of different classes of files, where some classes are more 'important' than others and more replicas are created for them to achieve improved availability. The question we attempt to answer is: given a fixed storage budget for storing replicas, what is the number of replicas of each file class to create to maximize the (weighted) overall availability of files? We present our work towards a replica number computation algorithm that takes into account a storage budget, a configurable maximum expected percentage of failed nodes, and weights for different file classes. Simulation results show that our algorithm is able to improve the availability of the prioritized files with higher weights, has a low computation time and can utilize storage space efficiently when total storage space scales to a large size.
AB  - Distributed storage systems often use replication for improved availability, performance and scalability. In this paper, we consider the case of using file replication to improve the availability of different classes of files, where some classes are more 'important' than others and more replicas are created for them to achieve improved availability. The question we attempt to answer is: given a fixed storage budget for storing replicas, what is the number of replicas of each file class to create to maximize the (weighted) overall availability of files? We present our work towards a replica number computation algorithm that takes into account a storage budget, a configurable maximum expected percentage of failed nodes, and weights for different file classes. Simulation results show that our algorithm is able to improve the availability of the prioritized files with higher weights, has a low computation time and can utilize storage space efficiently when total storage space scales to a large size.
KW  - availability
KW  - budget
KW  - optimization
KW  - replication
KW  - storage system
UR  - http://www.scopus.com/inward/record.url?scp=84886017223&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84886017223&partnerID=8YFLogxK
U2  - 10.1109/DSNW.2013.6615512
DO  - 10.1109/DSNW.2013.6615512
M3  - Conference contribution
AN  - SCOPUS:84886017223
SN  - 9781479901814
T3  - Proceedings of the International Conference on Dependable Systems and Networks
BT  - 2013 43rd Annual IEEE/IFIP Conference on Dependable Systems and Networks Workshop, DSN-W 2013
PB  - IEEE Computer Society
ER  -