Giving shape to large digital libraries through exploratory data analysis

Peter Organisciak, Benjamin M. Schmidt, J. Stephen Downie

Research output: Contribution to journal › Article › peer-review


The emergence of large multi-institutional digital libraries has opened the door to aggregate-level examinations of the published word. Such large-scale analysis offers a new way to pursue traditional problems in the humanities and social sciences, using digital methods to ask routine questions of large corpora. However, inquiry into multiple centuries of books is constrained by the burdens of scale, where statistical inference is technically complex and limited by hurdles to access and flexibility. This work examines the role that exploratory data analysis and visualization tools may play in understanding large bibliographic datasets. We present one such tool, HathiTrust+Bookworm, which allows multifaceted exploration of the multimillion-work HathiTrust Digital Library, and center it in the broader space of scholarly tools for exploratory data analysis.

Original language: English (US)
Pages (from-to): 317-332
Number of pages: 16
Journal: Journal of the Association for Information Science and Technology
Issue number: 2
State: Accepted/In press - 2021

ASJC Scopus subject areas

  • Information Systems
  • Computer Networks and Communications
  • Information Systems and Management
  • Library and Information Sciences

