What are You Saying? Using Topic to Detect Financial Misreporting

Nerissa C. Brown, Richard M. Crowley, W. Brooke Elliott

Research output: Working paper


This study uses a machine learning technique to assess whether the thematic content of financial statement disclosures (labeled as topic) is incrementally informative in predicting intentional misreporting. Using a Bayesian topic modeling algorithm, we determine and empirically quantify the topic content of a large collection of 10-K narratives spanning the 1994 to 2012 period. We find that the algorithm produces a valid set of semantically meaningful topics that are predictive of financial misreporting based on samples of SEC enforcement actions (AAERs) and irregularity restatements arising from intentional GAAP violations. Our out-of-sample tests indicate that topic significantly improves the detection of financial misreporting when added to models based on commonly used financial and textual style variables. Furthermore, we find that models including topic outperform traditional models when predicting long-duration misstatements. These results are robust to alternative topic definitions and regression specifications, as well as to various controls for firms with repeated instances of financial misreporting.
Original language: English (US)
Number of pages: 76
State: Published - Jul 5 2016

Publication series

Name: 27th Annual Conference on Financial Economics and Accounting Paper


Keywords

  • Topic
  • Disclosure
  • Latent Dirichlet Allocation
  • Financial Misreporting

