Research access to in-copyright texts in the humanities

Peter Organisciak, J. Stephen Downie

Research output: Chapter in Book/Report/Conference proceeding › Chapter


Text analysis in the digital humanities is challenged by legal hurdles, which make it difficult to access and especially to redistribute datasets of modern texts. As large digitisation projects grow, copyright challenges are increasingly acute. We discuss the legal landscape around large bibliographic datasets and explore principles of non-expressive and non-consumptive access as one way to enable research access to sensitive texts. Non-consumptive access seeks to make text available in an abstracted but maximally useful form, supporting research use without distributing the original, readable text. The HathiTrust Research Center is presented as a case study of these principles. Devoted to scholarly access to the 17 million works of the HathiTrust, the Research Center has been enabling access through feature datasets, high-level visualisation tools, an in-browser analysis suite and a secure virtual machine environment. These approaches have different strengths and challenges, and we consider how each may be instructive for the future of research over sensitive text.

Original language: English (US)
Title of host publication: Information and Knowledge Organisation in Digital Humanities
Subtitle of host publication: Global Perspectives
Publisher: Taylor and Francis
Number of pages: 21
ISBN (Electronic): 9781000521153
ISBN (Print): 9780367675516
State: Published - Jan 1 2021

ASJC Scopus subject areas

  • General Arts and Humanities
  • General Social Sciences
  • General Computer Science

