Using the HTRC data capsule model to promote reuse and evolution of experimental analysis of digital library data: A case study of topic modeling

David Bainbridge, David M. Nichols, Annika Hinze, J. Stephen Downie

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We report on a case study to independently reproduce the work given in a publicly available blog on how to develop a topic model sourced from a collection of texts, where both the data set and source code used are readily available. More specifically, we detail the steps necessary (and the challenges that had to be overcome) to replicate the work using the HathiTrust Research Center's virtual machine Data Capsule platform. From this we make recommendations for authors to follow, based on the lessons learned. We also show that the Data Capsule model can be put to work in a way that is of benefit to those interested in supporting computational reproducibility within their organizations.

Original language: English (US)
Title of host publication: Proceedings - 2019 ACM/IEEE Joint Conference on Digital Libraries, JCDL 2019
Editors: Maria Bonn, Dan Wu, Stephen J. Downie, Alain Martaus
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 463-464
Number of pages: 2
ISBN (Electronic): 9781728115474
State: Published - Jun 2019
Event: 19th ACM/IEEE Joint Conference on Digital Libraries, JCDL 2019 - Urbana-Champaign, United States
Duration: Jun 2 2019 to Jun 6 2019

Publication series

Name: Proceedings of the ACM/IEEE Joint Conference on Digital Libraries
Volume: 2019-June
ISSN (Print): 1552-5996

Conference

Conference: 19th ACM/IEEE Joint Conference on Digital Libraries, JCDL 2019
Country/Territory: United States
City: Urbana-Champaign
Period: 6/2/19 to 6/6/19

Keywords

  • Digital-Libraries
  • Experimental-Reproducibility
  • Virtual-Machine

ASJC Scopus subject areas

  • General Engineering
