TY - GEN
T1 - Brown Dog - A Science Driven Data Transformation Service
AU - McHenry, Kenton Guadron
AU - Lee, Jong Sung
AU - Kumar, Praveen
AU - Minsker, Barbara S
AU - Dietze, Michael C.
AU - Marciano, Richard
AU - Alameda, Jay
AU - Bradley, Shannon
AU - Marini, Luigi
AU - Kooper, Rob
AU - Navarro, Christopher M
AU - Padhy, Smruti
AU - Jansen, Greg
AU - Slavenas, Marcus
AU - Puthanveetil Satheesan, Sandeep
AU - Zhao, Yan
AU - Zhang, Bing
AU - Zharnitsky, Inna
AU - Roeder, Eugene
PY - 2016
Y1 - 2016
N2 - With growing, diverse volumes of digital data becoming part of modern scientific workflows, many research projects today begin with a process of data wrangling, i.e., finding, manipulating, indexing, cleaning, and bringing together needed datasets. Brown Dog, a Science Driven Data Transformation service, aims to alleviate much of the overhead and heterogeneity involved in this step, both of which hinder scientific reproducibility, by providing data transformations, such as format conversions and content-based extractions, as a service. Through a REST API, Brown Dog supports diverse usage by various clients such as gateways, programming languages, and tools. As a gateway, it provides a venue to access and preserve data transformation tools, track provenance, track information loss, manage data movement, and process jobs in a scalable manner across a diverse set of computational resources. Overall, Brown Dog provides a low-level data infrastructure to interface with digital data contents and, through its capabilities, enables a new era of science and applications at large over otherwise difficult-to-access datasets. Further, Brown Dog aims to serve not just the scientific community but the general public as a “DNS” for data, moving civilization towards an era where applications can be largely agnostic to the format/structure of the data and can instead focus on novel processes/applications on the contents.
AB - With growing, diverse volumes of digital data becoming part of modern scientific workflows, many research projects today begin with a process of data wrangling, i.e., finding, manipulating, indexing, cleaning, and bringing together needed datasets. Brown Dog, a Science Driven Data Transformation service, aims to alleviate much of the overhead and heterogeneity involved in this step, both of which hinder scientific reproducibility, by providing data transformations, such as format conversions and content-based extractions, as a service. Through a REST API, Brown Dog supports diverse usage by various clients such as gateways, programming languages, and tools. As a gateway, it provides a venue to access and preserve data transformation tools, track provenance, track information loss, manage data movement, and process jobs in a scalable manner across a diverse set of computational resources. Overall, Brown Dog provides a low-level data infrastructure to interface with digital data contents and, through its capabilities, enables a new era of science and applications at large over otherwise difficult-to-access datasets. Further, Brown Dog aims to serve not just the scientific community but the general public as a “DNS” for data, moving civilization towards an era where applications can be largely agnostic to the format/structure of the data and can instead focus on novel processes/applications on the contents.
KW - Gateways 2016
KW - Web Technologies (excl. Web Search)
KW - Computer Software
KW - Applied Computer Science
KW - Distributed and Grid Systems
KW - Distributed Computing
KW - Science Gateways
KW - SGCI
U2 - 10.6084/m9.figshare.4490735.v2
DO - 10.6084/m9.figshare.4490735.v2
M3 - Conference contribution
BT - Gateways 2016
ER -