Text-Mining the Humanities

Matthew L. Jockers, Ted Underwood

Research output: Chapter in Book/Report/Conference proceeding > Entry for encyclopedia/dictionary

Abstract

This chapter provides a broad overview of how text mining can be usefully employed in humanistic research. The chapter begins by addressing the question of why scholars in the humanities should care about text mining and what they might expect to gain by embracing what are deeply computational and deeply quantitative methods. We then offer a quick synopsis of the key watersheds in the history of text mining. The bulk of the chapter discusses central methodologies used in humanistic text mining. Using examples from the humanities, we unpack the differences between supervised and unsupervised learning and discuss how tools developed by researchers in other fields can be usefully employed to address humanistic questions. Drawing from personal experience, we address some of the significant challenges associated with data quality, metadata, and copyright restrictions before moving to a discussion of a few exemplary projects and resources for further study.

Original language: English (US)
Title of host publication: A New Companion to Digital Humanities
Editors: Susan Schreibman, Ray Siemens, John Unsworth
Publisher: Wiley-Blackwell
Pages: 291-306
Number of pages: 16
ISBN (Electronic): 9781118680605
ISBN (Print): 9781118680599
State: Published - Dec 15 2015

Keywords

  • Machine learning
  • Supervised learning
  • Text analysis
  • Text mining

ASJC Scopus subject areas

  • General Arts and Humanities
