TY - GEN
T1 - Planning large data transfers in institutional grids
AU - Bouabache, Fatiha
AU - Herault, Thomas
AU - Peyronnet, Sylvain
AU - Cappello, Franck
PY - 2010
Y1 - 2010
AB - In grid computing, many scientific and engineering applications require access to large amounts of distributed data. The size and number of these data collections have been growing rapidly in recent years, and the cost of data transmission accounts for a significant part of the overall execution time. When communication streams flow concurrently on shared links, transport control protocols have difficulty allocating bandwidth fairly to all the streams, and the network is used sub-optimally. One way to deal with this situation is to schedule the communications so as to make optimal use of the network. We focus on the case of large data transfers that can be completely described at initialization time. In this case, a data migration plan can be computed at initialization time and then executed. However, this computation phase must take little time compared to the actual execution of the plan. We propose a best-effort solution that approximately computes a communication plan, based on uniform random sampling of possible schedules. We show the effectiveness of this approach both theoretically and through simulations.
KW - Data transfer
KW - Scheduling
UR - http://www.scopus.com/inward/record.url?scp=77954949428&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77954949428&partnerID=8YFLogxK
U2 - 10.1109/CCGRID.2010.68
DO - 10.1109/CCGRID.2010.68
M3 - Conference contribution
AN - SCOPUS:77954949428
SN - 9781424469871
T3 - CCGrid 2010 - 10th IEEE/ACM International Conference on Cluster, Cloud, and Grid Computing
SP - 547
EP - 552
BT - CCGrid 2010 - 10th IEEE/ACM International Conference on Cluster, Cloud, and Grid Computing
T2 - 10th IEEE/ACM International Conference on Cluster, Cloud, and Grid Computing, CCGrid 2010
Y2 - 17 May 2010 through 20 May 2010
ER -