TY - JOUR
T1 - Locating the Leading Edge of Cultural Change
AU - Griebel, Sarah
AU - Cohen, Becca
AU - Li, Lucian
AU - Park, Jaihyun
AU - Liu, Jiayu
AU - Perkins, Jana
AU - Underwood, Ted
N1 - This work made use of the Illinois Campus Cluster, a computing resource that is operated by the Illinois Campus Cluster Program (ICCP) in conjunction with the National Center for Supercomputing Applications (NCSA) and which is supported by funds from the University of Illinois at Urbana-Champaign, specifically through the Illinois Computes program. This work also used the Delta system at the National Center for Supercomputing Applications through allocation xras-ncsa-72 from the Advanced Cyberinfrastructure Coordination Ecosystem: Services & Support (ACCESS) program, which is supported by National Science Foundation grants #2138259, #2138286, #2138307, #2137603, and #2138296. Some fiction data for this project was provided by HathiTrust Digital Library [7].
PY - 2024
Y1 - 2024
N2 - Measures of textual similarity and divergence are increasingly used to study cultural change. But which measures align, in practice, with social evidence about change? We apply three different representations of text (topic models, document embeddings, and word-level perplexity) to three different corpora (literary studies, economics, and fiction). In every case, works by highly-cited authors and younger authors are textually ahead of the curve. We don’t find clear evidence that one representation of text is to be preferred over the others. But alignment with social evidence is strongest when texts are represented through the top quartile of passages, suggesting that a text’s impact may depend more on its most forward-looking moments than on sustaining a high level of innovation throughout.
AB - Measures of textual similarity and divergence are increasingly used to study cultural change. But which measures align, in practice, with social evidence about change? We apply three different representations of text (topic models, document embeddings, and word-level perplexity) to three different corpora (literary studies, economics, and fiction). In every case, works by highly-cited authors and younger authors are textually ahead of the curve. We don’t find clear evidence that one representation of text is to be preferred over the others. But alignment with social evidence is strongest when texts are represented through the top quartile of passages, suggesting that a text’s impact may depend more on its most forward-looking moments than on sustaining a high level of innovation throughout.
KW - bibliometrics
KW - cultural change
KW - document embeddings
KW - fiction
KW - topic modeling
UR - https://www.scopus.com/pages/publications/85210845410
UR - https://www.scopus.com/pages/publications/85210845410#tab=citedBy
M3 - Conference article
AN - SCOPUS:85210845410
SN - 1613-0073
VL - 3834
SP - 232
EP - 245
JO - CEUR Workshop Proceedings
JF - CEUR Workshop Proceedings
T2 - 2024 Computational Humanities Research Conference, CHR 2024
Y2 - 4 December 2024 through 6 December 2024
ER -