The Literary Uses of High-Dimensional Space

Research output: Contribution to journal › Article › peer-review


Debates over “Big Data” shed more heat than light in the humanities, because the term ascribes new importance to statistical methods without explaining how those methods have changed. What we badly need instead is a conversation about the substantive innovations that have made statistical modeling useful for disciplines where, in the past, it truly wasn’t. These innovations are partly technical, but more fundamentally expressed in what Leo Breiman calls a new “culture” of statistical modeling. Where 20th-century methods often required humanists to squeeze our unstructured texts, sounds, or images into some special-purpose data model, new methods can handle unstructured evidence more directly by modeling it in a high-dimensional space. This opens a range of research opportunities that humanists have barely begun to discuss. To date, topic modeling has received most attention, but in the long run, supervised predictive models may be even more important. I sketch their potential by describing how Jordan Sellers and I have begun to model poetic distinction in the long 19th century—revealing an arc of gradual change much longer than received literary histories would lead us to expect.
Original language: English (US)
Number of pages: 6
Journal: Big Data & Society
Issue number: 2
State: Published - Dec 27 2015


Keywords

  • Literary distinction
  • bag of words
  • literary theory
  • machine learning
  • poetic diction
  • predictive modeling

ASJC Scopus subject areas

  • Computer Science Applications
  • Information Systems
  • Information Systems and Management
  • Communication
  • Library and Information Sciences

