TY - GEN
T1 - A workflow-aware storage system
T2 - 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, CCGrid 2012
AU - Vairavanathan, Emalayan
AU - Al-Kiswany, Samer
AU - Costa, Lauro Beltrão
AU - Zhang, Zhao
AU - Katz, Daniel S.
AU - Wilde, Michael
AU - Ripeanu, Matei
N1 - Copyright: Copyright 2012 Elsevier B.V., All rights reserved.
PY - 2012
Y1 - 2012
N2 - This paper evaluates the potential gains a workflow-aware storage system can bring. Two observations make us believe such a storage system is crucial to efficiently support workflow-based applications. First, workflows generate irregular and application-dependent data access patterns. These patterns render existing storage systems unable to harness all optimization opportunities, as doing so often requires conflicting optimization options or even conflicting design decisions at the level of the storage system. Second, when scheduling, workflow runtime engines make suboptimal decisions because they lack detailed data location information. This paper discusses the feasibility, and evaluates the potential performance benefits, of building a workflow-aware storage system that supports per-file access optimizations and exposes data location. To this end, it presents approaches to determine the application-specific data access patterns and experimentally evaluates the performance gains of a workflow-aware storage approach. Our evaluation using synthetic benchmarks shows that a workflow-aware storage system can bring significant performance gains: up to 7x compared to the distributed storage system MosaStore and up to 16x compared to a central, well-provisioned NFS server.
AB - This paper evaluates the potential gains a workflow-aware storage system can bring. Two observations make us believe such a storage system is crucial to efficiently support workflow-based applications. First, workflows generate irregular and application-dependent data access patterns. These patterns render existing storage systems unable to harness all optimization opportunities, as doing so often requires conflicting optimization options or even conflicting design decisions at the level of the storage system. Second, when scheduling, workflow runtime engines make suboptimal decisions because they lack detailed data location information. This paper discusses the feasibility, and evaluates the potential performance benefits, of building a workflow-aware storage system that supports per-file access optimizations and exposes data location. To this end, it presents approaches to determine the application-specific data access patterns and experimentally evaluates the performance gains of a workflow-aware storage approach. Our evaluation using synthetic benchmarks shows that a workflow-aware storage system can bring significant performance gains: up to 7x compared to the distributed storage system MosaStore and up to 16x compared to a central, well-provisioned NFS server.
KW - Large-scale storage systems
KW - workflow optimizations
KW - workflow-aware storage systems
UR - http://www.scopus.com/inward/record.url?scp=84863702050&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84863702050&partnerID=8YFLogxK
U2 - 10.1109/CCGrid.2012.109
DO - 10.1109/CCGrid.2012.109
M3 - Conference contribution
AN - SCOPUS:84863702050
SN - 9780769546919
T3 - Proceedings - 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, CCGrid 2012
SP - 326
EP - 334
BT - Proceedings - 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, CCGrid 2012
Y2 - 13 May 2012 through 16 May 2012
ER -