TY - GEN
T1 - New worker-centric scheduling strategies for data-intensive grid applications
AU - Ko, Steven Y.
AU - Morales, Ramsés
AU - Gupta, Indranil
PY - 2007
Y1 - 2007
N2 - Distributed computations, dealing with large amounts of data, are scheduled in Grid clusters today using either a task-centric mechanism, or a worker-centric mechanism. Because of the large data sets, the execution time is bounded by the cost of data transfer. In this paper, we introduce new worker-centric scheduling strategies that are novel in that they aim to implicitly exploit the locality of interest in order to reduce the cost of data transfer. Many Grid applications are characterized by such a locality of interest, i.e., a file is often accessed by multiple tasks and, more importantly, a set of files that are accessed by one task are also likely to be accessed together by other tasks. Our new deterministic, as well as probabilistic, scheduling algorithms implicitly exploit this feature to improve running time. Our experiments are done with traces of a real Grid application (Coadd), and show that our algorithms are able to achieve utilization of over 90%, while reducing makespan significantly compared to task-centric approaches.
AB - Distributed computations, dealing with large amounts of data, are scheduled in Grid clusters today using either a task-centric mechanism, or a worker-centric mechanism. Because of the large data sets, the execution time is bounded by the cost of data transfer. In this paper, we introduce new worker-centric scheduling strategies that are novel in that they aim to implicitly exploit the locality of interest in order to reduce the cost of data transfer. Many Grid applications are characterized by such a locality of interest, i.e., a file is often accessed by multiple tasks and, more importantly, a set of files that are accessed by one task are also likely to be accessed together by other tasks. Our new deterministic, as well as probabilistic, scheduling algorithms implicitly exploit this feature to improve running time. Our experiments are done with traces of a real Grid application (Coadd), and show that our algorithms are able to achieve utilization of over 90%, while reducing makespan significantly compared to task-centric approaches.
KW - Data-intensive applications
KW - Grid environments
KW - Task-centric scheduling
KW - Worker-centric scheduling
UR - http://www.scopus.com/inward/record.url?scp=38349050503&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=38349050503&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-76778-7_7
DO - 10.1007/978-3-540-76778-7_7
M3 - Conference contribution
AN - SCOPUS:38349050503
SN - 9783540767770
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 121
EP - 142
BT - Middleware 2007 - ACM/IFIP/USENIX 8th International Middleware Conference, Proceedings
A2 - Cerqueira, Renato
A2 - Campbell, Roy H.
PB - Springer
T2 - 8th International Middleware Conference, Middleware 2007
Y2 - 26 November 2007 through 30 November 2007
ER -