TY - GEN
T1 - A fresh perspective on developing and executing DAG-based distributed applications
T2 - 5th IEEE International Conference on e-Science, e-Science 2009
AU - Merzky, Andre
AU - Stamou, Katerina
AU - Jha, Shantenu
AU - Katz, Daniel S.
PY - 2009
Y1 - 2009
N2 - Most workflow-based applications currently have to adapt to available tools. While this keeps the cost of development low, it can lead to performance and flexibility tradeoffs that the application developer and deployer must make. In this paper, we use the Montage astronomical image mosaicking application as a prototypical DAG-based workflow application to lay out the development and deployment decisions for distributed applications. We discuss and explain the lack of simple (easy-to-use), scalable, and extensible distributed applications. We then introduce SAGA as a technology that permits the construction of abstractions that aid the development and execution of such applications, and thus addresses some of the common shortcomings of traditional distributed application development. We use Montage together with SAGA to examine how legacy applications can be made to run on distributed infrastructures, to see if our reasons are valid, and to compare potential new methods for creating distributed applications with existing technologies that are currently used. We demonstrate the ability to (i) scale out and (ii) use different production infrastructure, while maintaining performance comparable to established systems. Our hope is that by demonstrating the simplicity of development along with other advantages (performance, scalability, extensibility, and infrastructure independence), this example will encourage others to think more broadly about how distributed applications are created and how new programming models such as Dryad can be supported in an infrastructure-independent way, thus eventually leading to more applications that can seamlessly scale out.
AB - Most workflow-based applications currently have to adapt to available tools. While this keeps the cost of development low, it can lead to performance and flexibility tradeoffs that the application developer and deployer must make. In this paper, we use the Montage astronomical image mosaicking application as a prototypical DAG-based workflow application to lay out the development and deployment decisions for distributed applications. We discuss and explain the lack of simple (easy-to-use), scalable, and extensible distributed applications. We then introduce SAGA as a technology that permits the construction of abstractions that aid the development and execution of such applications, and thus addresses some of the common shortcomings of traditional distributed application development. We use Montage together with SAGA to examine how legacy applications can be made to run on distributed infrastructures, to see if our reasons are valid, and to compare potential new methods for creating distributed applications with existing technologies that are currently used. We demonstrate the ability to (i) scale out and (ii) use different production infrastructure, while maintaining performance comparable to established systems. Our hope is that by demonstrating the simplicity of development along with other advantages (performance, scalability, extensibility, and infrastructure independence), this example will encourage others to think more broadly about how distributed applications are created and how new programming models such as Dryad can be supported in an infrastructure-independent way, thus eventually leading to more applications that can seamlessly scale out.
UR - http://www.scopus.com/inward/record.url?scp=77949778068&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77949778068&partnerID=8YFLogxK
U2 - 10.1109/e-Science.2009.40
DO - 10.1109/e-Science.2009.40
M3 - Conference contribution
AN - SCOPUS:77949778068
SN - 9780769538778
T3 - e-Science 2009 - 5th IEEE International Conference on e-Science
SP - 231
EP - 238
BT - e-Science 2009 - 5th IEEE International Conference on e-Science
Y2 - 9 December 2009 through 11 December 2009
ER -