Petabytes in Practice: Working with Collections as Data at Scale

Will R. Thomas, Benjamin Galewsky, Sandeep Puthanveetil Satheesan, Gregory Jansen, Richard Marciano, Shannon Bradley, Jong Lee, Luigi Marini, Kenton McHenry

Research output: Contribution to journal, Article, peer-review


The emerging transdiscipline of Computational Archival Science (CAS) links frameworks such as Brown Dog and repository software such as Digital Repository At Scale To Invite Computation (DRAS-TIC) to yield an understanding of working with digital collections at scale for cultural data. The DRAS-TIC and Brown Dog projects here serve as the basis for an expandable distributed storage/service architecture with on-demand, horizontally scalable integrated digital preservation and analysis services.

Original language: English (US)
Pages (from-to): 18-25
Number of pages: 8
Journal: Data and Information Management
Issue number: 1
State: Published - Mar 1 2019


Keywords

  • data repositories
  • JCDL workshop proceedings
  • machine learning
  • parallel processing

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • Library and Information Sciences

