EXTRACTING ACTIONABLE INSIGHTS FROM TEXT DATA: A STABLE TOPIC MODEL APPROACH

Research output: Contribution to journal › Article › peer-review

Abstract

Topic models are becoming a frequently employed tool in the empirical methods repertoire of information systems and management scholars. Given textual corpora, such as consumer reviews and online discussion forums, researchers and business practitioners often use topic modeling to either explore data in an unsupervised fashion or generate variables of interest for subsequent econometric analysis. However, one important concern stems from the fact that topic models can be notorious for their instability, i.e., the generated results could be inconsistent and irreproducible at different times, even on the same dataset. Therefore, researchers might arrive at potentially unreliable results regarding the theoretical relationships that they are testing or developing. In this paper, we attempt to highlight this problem and suggest a potential approach to addressing it. First, we empirically define and evaluate the stability problem of topic models using four textual datasets. Next, to alleviate the problem and with the goal of extracting actionable insights from textual data, we propose a new method, Stable LDA, which incorporates topical word clusters into the topic model to steer the model inference toward consistent results. We show that the proposed Stable LDA approach can significantly improve model stability while maintaining or even improving the topic model quality. Further, employing two case studies related to an online knowledge community and online consumer reviews, we demonstrate that the variables generated from Stable LDA can lead to more consistent estimations in econometric analyses. We believe that our work can further enhance management scholars’ collective toolkit to analyze ever-growing textual data.
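
The instability described in the abstract can be made concrete by comparing the top words of topics from two runs on the same corpus. The sketch below is only an illustration under stated assumptions: the best-match Jaccard overlap, the function names `jaccard` and `run_stability`, and the word lists are all hypothetical, and this is not necessarily the stability measure defined in the paper, nor is it an implementation of Stable LDA.

```python
def jaccard(a, b):
    """Jaccard similarity between two collections of words."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def run_stability(topics_a, topics_b):
    """Average best-match Jaccard overlap between the topics of two runs.

    Each argument is a list of topics; each topic is a list of its top words.
    The result lies in [0, 1]; 1.0 means every topic in run A has an identical
    top-word set somewhere in run B. (Assumed measure, for illustration only.)
    """
    if not topics_a or not topics_b:
        raise ValueError("both runs must contain at least one topic")
    # For each topic in run A, keep its best overlap with any topic in run B.
    best = [max(jaccard(t, u) for u in topics_b) for t in topics_a]
    return sum(best) / len(best)


# Hypothetical top-word lists from two runs on the same corpus.
run_1 = [["price", "cheap", "value"], ["battery", "charge", "life"]]
run_2 = [["battery", "life", "charge"], ["price", "value", "shipping"]]

score = run_stability(run_1, run_2)
```

The score is asymmetric, since it matches topics from the first run against the second, so averaging both directions may be preferable in practice. A low score across repeated runs would signal that variables derived from the topics could shift from run to run.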

Original language: English (US)
Pages (from-to): 923-954
Number of pages: 32
Journal: MIS Quarterly: Management Information Systems
Volume: 47
Issue number: 3
Early online date: Jun 1 2022
State: Published - Sep 2023

Keywords

  • Stable LDA
  • Topic modeling
  • empirical analysis
  • stability
  • text analysis

ASJC Scopus subject areas

  • Management Information Systems
  • Information Systems
  • Computer Science Applications
  • Information Systems and Management
