DISC: A system for distributed data intensive scientific computing

George Kola, Tevfik Kosar, Jaime Frey, Miron Livny, Robert Brunner, Michael Remijan

Research output: Contribution to conference › Paper › peer-review

Abstract

The increasing computation and data requirements of scientific applications have necessitated the use of distributed resources owned by collaborating parties. While existing distributed systems work well for computation that requires limited data movement, they fail in unexpected ways when the computation accesses, creates, and moves large amounts of data over wide-area networks. In this work, we analyzed the problems with existing systems and used the result of this analysis to design our own system. Realizing that it takes a long while for a new system to stabilize, we tried our best to reuse existing components. We added new components only when we could not get by with adding features to existing ones. We used our system to successfully process three terabytes of DPOSS image data in under a week by using idle CPUs in desktops and commodity clusters in the UW-Madison Computer Science Department and Starlight.

Original language: English (US)
State: Published - 2004
Event: 1st USENIX Workshop on Real, Large Distributed Systems, WORLDS 2004 - San Francisco, United States
Duration: Dec 5 2004 → …

Conference

Conference: 1st USENIX Workshop on Real, Large Distributed Systems, WORLDS 2004
Country/Territory: United States
City: San Francisco
Period: 12/5/04 → …

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Hardware and Architecture
  • Software
