What Are You Saying? Using topic to Detect Financial Misreporting

Nerissa C. Brown, Richard M. Crowley, W. Brooke Elliott

Research output: Contribution to journal › Article › peer-review


We use a machine learning technique to assess whether the thematic content of financial statement disclosures (labeled topic) is incrementally informative in predicting intentional misreporting. Using a Bayesian topic modeling algorithm, we determine and empirically quantify the topic content of a large collection of 10-K narratives spanning 1994 to 2012. We find that the algorithm produces a valid set of semantically meaningful topics that predict financial misreporting, based on samples of Securities and Exchange Commission (SEC) enforcement actions (Accounting and Auditing Enforcement Releases [AAERs]) and irregularities identified from financial restatements and 10-K filing amendments. Our out-of-sample tests indicate that topic significantly improves the detection of financial misreporting by as much as 59% when added to models based on commonly used financial and textual style variables. Furthermore, models that incorporate topic significantly outperform traditional models when detecting serious revenue recognition and core expense errors. Taken together, our results suggest that the topics discussed in annual report filings and the attention devoted to each topic are useful signals in detecting financial misreporting.
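As an illustration of the general approach the abstract describes, the sketch below fits a latent Dirichlet allocation model to a toy set of filing-like texts with scikit-learn and extracts per-document topic weights. The documents, component count, and variable names are hypothetical stand-ins, not the paper's actual corpus or implementation.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Toy stand-ins for 10-K narrative text (hypothetical, for illustration only).
docs = [
    "revenue recognition deferred revenue contract customers",
    "goodwill impairment intangible assets amortization expense",
    "revenue contract performance obligations recognized",
    "credit risk interest rate hedging derivatives exposure",
]

# Build a document-term matrix of word counts.
vectorizer = CountVectorizer()
dtm = vectorizer.fit_transform(docs)

# Fit an LDA model; n_components is an assumed, illustrative topic count.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
topic_weights = lda.fit_transform(dtm)

# Each row is a filing's topic distribution (shares summing to ~1); such
# per-topic attention shares are the kind of feature a misreporting
# detection model could take as input alongside financial variables.
print(topic_weights.shape)  # (4, 2)
```

The row-normalized topic weights quantify how much attention each document devotes to each topic, which is the sense in which the abstract treats topic content as a predictive signal.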

Original language: English (US)
Pages (from-to): 237-291
Number of pages: 55
Journal: Journal of Accounting Research
Issue number: 1
State: Published - Mar 1 2020


Keywords

  • C80
  • K22
  • K42
  • M40
  • M41
  • M48
  • disclosure
  • financial misreporting
  • latent Dirichlet allocation
  • topic modeling

ASJC Scopus subject areas

  • Accounting
  • Finance
  • Economics and Econometrics


