New worker-centric scheduling strategies for data-intensive grid applications

Steven Y. Ko, Ramsés Morales, Indranil Gupta

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Distributed computations, dealing with large amounts of data, are scheduled in Grid clusters today using either a task-centric mechanism, or a worker-centric mechanism. Because of the large data sets, the execution time is bounded by the cost of data transfer. In this paper, we introduce new worker-centric scheduling strategies that are novel in that they aim to implicitly exploit the locality of interest in order to reduce the cost of data transfer. Many Grid applications are characterized by such a locality of interest, i.e., a file is often accessed by multiple tasks and, more importantly, a set of files that are accessed by one task are also likely to be accessed together by other tasks. Our new deterministic, as well as probabilistic, scheduling algorithms implicitly exploit this feature to improve running time. Our experiments are done with traces of a real Grid application (Coadd), and show that our algorithms are able to achieve utilization of over 90%, while reducing makespan significantly compared to task-centric approaches.
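As a rough illustration of the worker-centric idea described in the abstract, the sketch below picks, for a worker that has just become idle, the pending task whose input files overlap most with the files that worker already holds. This is only a minimal sketch under stated assumptions: the abstract does not specify the paper's actual selection metrics, so the overlap count, the function name `pick_task_for_worker`, and the task and file names are all hypothetical and are not the authors' algorithms.

```python
from typing import Dict, Optional, Set


def pick_task_for_worker(
    cached_files: Set[str],
    pending_tasks: Dict[str, Set[str]],
) -> Optional[str]:
    """Return the id of the pending task sharing the most files with the worker's cache.

    Hypothetical helper: the overlap count is an assumed metric, chosen only to
    illustrate how locality of interest could reduce data transfer.
    """
    if not pending_tasks:
        return None
    # Iterate over sorted ids so that ties go to the alphabetically first task.
    return max(
        sorted(pending_tasks),
        key=lambda task_id: len(pending_tasks[task_id] & cached_files),
    )


# Hypothetical workload: each task maps to the set of files it reads.
tasks = {
    "t1": {"a", "b", "c"},
    "t2": {"x", "y"},
    "t3": {"b", "z"},
}
worker_cache = {"a", "b"}

chosen = pick_task_for_worker(worker_cache, tasks)
# Only the files missing from the worker's cache need to be transferred.
files_to_fetch = tasks[chosen] - worker_cache
```

Here the worker already holds two of the three files needed by `t1`, so choosing `t1` requires transferring a single file, whereas `t2` would require two. A task-centric scheduler would instead start from a task and look for a worker to place it on.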

Original language: English (US)
Title of host publication: Middleware 2007 - ACM/IFIP/USENIX 8th International Middleware Conference, Proceedings
Editors: Renato Cerqueira, Roy H. Campbell
Publisher: Springer
Pages: 121-142
Number of pages: 22
ISBN (Print): 9783540767770
State: Published - 2007
Event: 8th International Middleware Conference, Middleware 2007 - Newport Beach, CA, United States
Duration: Nov 26 2007 - Nov 30 2007

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4834 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 8th International Middleware Conference, Middleware 2007
Country/Territory: United States
City: Newport Beach, CA
Period: 11/26/07 - 11/30/07

Keywords

  • Data-intensive applications
  • Grid environments
  • Task-centric scheduling
  • Worker-centric scheduling

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
