TY - GEN
T1 - BLAST application with data-aware desktop grid middleware
AU - He, Haiwu
AU - Fedak, Gilles
AU - Tang, Bing
AU - Cappello, Franck
PY - 2009
Y1 - 2009
N2 - There exist numerous Grid middleware systems for developing and executing programs on the computational Grid, but they still require intensive work from their users. BitDew is designed to facilitate the use of large-scale Grids with dynamic, heterogeneous, volatile and highly distributed computing resources for applications that require a huge amount of data processing. Data-intensive applications form an important class of applications for the e-Science community; they require secure and coordinated access to large datasets, wide-area transfers and broad distribution of terabytes of data while keeping track of multiple data replicas. In genetic biology, gene sequence comparison and analysis are the most basic routines. With the considerable increase in sequences to analyze, we need more and more computing power as well as efficient solutions to manage data. In this work, we investigate the advantages of using a new Desktop Grid middleware, BitDew, designed for large-scale data management. Our contribution is two-fold: firstly, we introduce a data-driven Master/Slave programming model and present an implementation of BLAST over BitDew following this model; secondly, we present extensive experimental and simulation results which demonstrate the effectiveness and scalability of our approach. We evaluate the benefit of multi-protocol data distribution to achieve remarkable speedups, we report on the ability to cope with highly volatile environments with relative performance degradation, we show the benefit of data replication in Grids with heterogeneous resource performance, and we evaluate the combination of data fault tolerance and data replication when computing on volatile resources.
AB - There exist numerous Grid middleware systems for developing and executing programs on the computational Grid, but they still require intensive work from their users. BitDew is designed to facilitate the use of large-scale Grids with dynamic, heterogeneous, volatile and highly distributed computing resources for applications that require a huge amount of data processing. Data-intensive applications form an important class of applications for the e-Science community; they require secure and coordinated access to large datasets, wide-area transfers and broad distribution of terabytes of data while keeping track of multiple data replicas. In genetic biology, gene sequence comparison and analysis are the most basic routines. With the considerable increase in sequences to analyze, we need more and more computing power as well as efficient solutions to manage data. In this work, we investigate the advantages of using a new Desktop Grid middleware, BitDew, designed for large-scale data management. Our contribution is two-fold: firstly, we introduce a data-driven Master/Slave programming model and present an implementation of BLAST over BitDew following this model; secondly, we present extensive experimental and simulation results which demonstrate the effectiveness and scalability of our approach. We evaluate the benefit of multi-protocol data distribution to achieve remarkable speedups, we report on the ability to cope with highly volatile environments with relative performance degradation, we show the benefit of data replication in Grids with heterogeneous resource performance, and we evaluate the combination of data fault tolerance and data replication when computing on volatile resources.
KW - BLAST
KW - BitDew
KW - Data-aware middleware
KW - Desktop grids
UR - http://www.scopus.com/inward/record.url?scp=70349733004&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70349733004&partnerID=8YFLogxK
U2 - 10.1109/CCGRID.2009.91
DO - 10.1109/CCGRID.2009.91
M3 - Conference contribution
AN - SCOPUS:70349733004
SN - 9780769536224
T3 - 2009 9th IEEE/ACM International Symposium on Cluster Computing and the Grid, CCGRID 2009
SP - 284
EP - 291
BT - 2009 9th IEEE/ACM International Symposium on Cluster Computing and the Grid, CCGRID 2009
T2 - 2009 9th IEEE/ACM International Symposium on Cluster Computing and the Grid, CCGRID 2009
Y2 - 18 May 2009 through 21 May 2009
ER -