TY - JOUR
T1 - BitDew
T2 - A data management and distribution service with multi-protocol file transfer and metadata abstraction
AU - Fedak, Gilles
AU - He, Haiwu
AU - Cappello, Franck
N1 - Funding Information:
Experiments presented in this paper were carried out using the DSL-Lab experimental testbed, an initiative supported by the French ANR JCJC program (see http://www.dsllab.org) under Grant JC05_55975. This research is also funded in part by the European FP6 project Grid4all.
PY - 2009/9
Y1 - 2009/9
AB - Desktop Grids use the computing, network and storage resources of idle desktop PCs distributed over multiple LANs or the Internet to run a large variety of resource-demanding distributed applications. While these applications need to access, compute, store and circulate large volumes of data, little attention has been paid to data management in such large-scale, dynamic, heterogeneous, volatile and highly distributed Grids. In most cases, data management relies on ad hoc solutions, and providing a general approach is still a challenging issue. A new class of data management service is desirable to deal with a variety of file transfer protocols such as client/server, P2P, or the new and emerging Cloud storage services. To address this problem, we propose the BitDew framework, a programmable environment for automatic and transparent data management on computational Desktop Grids. This paper describes the BitDew programming interface, its architecture, and the performance evaluation of its runtime components. BitDew relies on a specific set of metadata to drive key data management operations, namely life cycle, distribution, placement, replication and fault tolerance, with a high level of abstraction. The BitDew runtime environment is a flexible distributed service architecture that integrates modular P2P components such as DHTs (Distributed Hash Tables) for a Distributed Data Catalog and collaborative transport protocols for data distribution. We explain how to plug in new or existing protocols, and we give evidence of the versatility of the framework by implementing the HTTP, FTP and BitTorrent protocols and access to the Amazon S3 and IBP Wide Area Storage. We describe the mechanisms used to provide asynchronous and reliable multi-protocol transfers. Through several examples, we describe how application programmers and BitDew users can exploit BitDew's features. We report on performance evaluation using micro-benchmarks, various usage scenarios and a data-intensive bioinformatics application, both in the Grid context and on the Internet. The performance evaluation demonstrates that the high level of abstraction and transparency is obtained with a reasonable overhead, while offering the benefits of scalability, performance and fault tolerance with little programming cost.
KW - Cloud computing
KW - Content network
KW - Desktop Grid
KW - P2P
UR - http://www.scopus.com/inward/record.url?scp=67649831348&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=67649831348&partnerID=8YFLogxK
U2 - 10.1016/j.jnca.2009.04.002
DO - 10.1016/j.jnca.2009.04.002
M3 - Article
AN - SCOPUS:67649831348
SN - 1084-8045
VL - 32
SP - 961
EP - 975
JO - Journal of Network and Computer Applications
JF - Journal of Network and Computer Applications
IS - 5
ER -