Abstract
Debates over “Big Data” shed more heat than light in the humanities, because the term ascribes new importance to statistical methods without explaining how those methods have changed. What we badly need instead is a conversation about the substantive innovations that have made statistical modeling useful for disciplines where, in the past, it truly wasn’t. These innovations are partly technical, but more fundamentally expressed in what Leo Breiman calls a new “culture” of statistical modeling. Where 20th-century methods often required humanists to squeeze our unstructured texts, sounds, or images into some special-purpose data model, new methods can handle unstructured evidence more directly by modeling it in a high-dimensional space. This opens a range of research opportunities that humanists have barely begun to discuss. To date, topic modeling has received most attention, but in the long run, supervised predictive models may be even more important. I sketch their potential by describing how Jordan Sellers and I have begun to model poetic distinction in the long 19th century—revealing an arc of gradual change much longer than received literary histories would lead us to expect.
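The abstract's core idea, representing unstructured texts as bag-of-words vectors in a shared high-dimensional space and then training a supervised model to predict a category such as "reviewed in elite venues," can be illustrated with a toy sketch. This is not the authors' actual corpus, features, or model (Underwood and Sellers use much larger collections and regularized regression); the texts, the `predict` helper, and the centroid-plus-cosine-similarity classifier below are illustrative stand-ins chosen for brevity.

```python
from collections import Counter

# Two tiny invented "corpora" standing in for labeled groups of poems.
reviewed = ["the pale moon grieves above the silent sea",
            "a shadow falls across the ancient stone"]
random_sample = ["the cat sat on the mat near the door",
                 "we went to town to buy some bread"]

# Shared vocabulary: one dimension per word type across all texts.
vocab = sorted({w for t in reviewed + random_sample for w in t.split()})

def vectorize(text):
    """Bag-of-words vector: raw count of each vocabulary word."""
    counts = Counter(text.split())
    return [counts[w] for w in vocab]

def centroid(texts):
    """Mean vector of a group of texts, one point per class."""
    vecs = [vectorize(t) for t in texts]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def cosine(u, v):
    """Cosine similarity between two vectors (0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv) if nu and nv else 0.0

c_rev, c_rand = centroid(reviewed), centroid(random_sample)

def predict(text):
    """Assign a new text to whichever class centroid it lies nearer."""
    v = vectorize(text)
    return "reviewed" if cosine(v, c_rev) >= cosine(v, c_rand) else "random"

print(predict("the moon above the sea"))
```

The point of the sketch is only the geometry: each text becomes a point in a space with as many dimensions as the vocabulary, and "modeling distinction" means finding a boundary between labeled regions of that space.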
| Original language | English (US) |
|---|---|
| Number of pages | 6 |
| Journal | Big Data & Society |
| Volume | 2 |
| Issue number | 2 |
| DOIs | |
| State | Published - Dec 27 2015 |
Keywords
- Literary distinction
- bag of words
- literary theory
- machine learning
- poetic diction
- predictive modeling
ASJC Scopus subject areas
- Computer Science Applications
- Information Systems
- Information Systems and Management
- Communication
- Library and Information Sciences