What are You Saying? Using Topic to Detect Financial Misreporting

Nerissa C. Brown, Richard M. Crowley, W. Brooke Elliott

Research output: Working paper

Abstract

This study uses a machine learning technique to assess whether the thematic content of financial statement disclosures (labeled as topic) is incrementally informative in predicting intentional misreporting. Using a Bayesian topic modeling algorithm, we determine and empirically quantify the topic content of a large collection of 10-K narratives spanning the 1994 to 2012 period. We find that the algorithm produces a valid set of semantically meaningful topics that are predictive of financial misreporting based on samples of SEC enforcement actions (AAERs) and irregularity restatements arising from intentional GAAP violations. Our out-of-sample tests indicate that topic significantly improves the detection of financial misreporting when added to models based on commonly-used financial and textual style variables. Furthermore, we find that models including topic outperform traditional models when predicting long-duration misstatements. These results are robust to alternative topic definitions and regression specifications and various controls for firms with repeated instances of financial misreporting.
Original language: English (US)
Number of pages: 76
DOIs: https://doi.org/10.2139/ssrn.2803733
State: Published - Jul 5 2016

Publication series

Name: 27th Annual Conference on Financial Economics and Accounting Paper

Fingerprint

Misreporting
Enforcement
Violations
Machine learning
Disclosure
Financial statements
Modeling
Restatements
Topic model

Keywords

  • Topic
  • Disclosure
  • Latent Dirichlet Allocation
  • Financial Misreporting

Cite this

Brown, N. C., Crowley, R. M., & Elliott, W. B. (2016). What are You Saying? Using Topic to Detect Financial Misreporting. (27th Annual Conference on Financial Economics and Accounting Paper). https://doi.org/10.2139/ssrn.2803733

What are You Saying? Using Topic to Detect Financial Misreporting. / Brown, Nerissa C.; Crowley, Richard M.; Elliott, W. Brooke.

2016. (27th Annual Conference on Financial Economics and Accounting Paper).

Brown, NC, Crowley, RM & Elliott, WB 2016 'What are You Saying? Using Topic to Detect Financial Misreporting' 27th Annual Conference on Financial Economics and Accounting Paper. https://doi.org/10.2139/ssrn.2803733
Brown NC, Crowley RM, Elliott WB. What are You Saying? Using Topic to Detect Financial Misreporting. 2016 Jul 5. (27th Annual Conference on Financial Economics and Accounting Paper). https://doi.org/10.2139/ssrn.2803733
Brown, Nerissa C. ; Crowley, Richard M. ; Elliott, W. Brooke. / What are You Saying? Using Topic to Detect Financial Misreporting. 2016. (27th Annual Conference on Financial Economics and Accounting Paper).
@techreport{0bb31868608246abbc691dbe63e906fa,
title = "What are You Saying? Using Topic to Detect Financial Misreporting",
abstract = "This study uses a machine learning technique to assess whether the thematic content of financial statement disclosures (labeled as topic) is incrementally informative in predicting intentional misreporting. Using a Bayesian topic modeling algorithm, we determine and empirically quantify the topic content of a large collection of 10-K narratives spanning the 1994 to 2012 period. We find that the algorithm produces a valid set of semantically meaningful topics that are predictive of financial misreporting based on samples of SEC enforcement actions (AAERs) and irregularity restatements arising from intentional GAAP violations. Our out-of-sample tests indicate that topic significantly improves the detection of financial misreporting when added to models based on commonly-used financial and textual style variables. Furthermore, we find that models including topic outperform traditional models when predicting long-duration misstatements. These results are robust to alternative topic definitions and regression specifications and various controls for firms with repeated instances of financial misreporting.",
keywords = "Topic, Disclosure, Latent Dirichlet Allocation, Financial Misreporting",
author = "Brown, {Nerissa C.} and Crowley, {Richard M.} and Elliott, {W. Brooke}",
year = "2016",
month = "7",
day = "5",
doi = "10.2139/ssrn.2803733",
language = "English (US)",
series = "27th Annual Conference on Financial Economics and Accounting Paper",
type = "WorkingPaper",
}

TY - UNPB
T1 - What are You Saying? Using Topic to Detect Financial Misreporting
AU - Brown, Nerissa C.
AU - Crowley, Richard M.
AU - Elliott, W. Brooke
PY - 2016/7/5
Y1 - 2016/7/5
N2 - This study uses a machine learning technique to assess whether the thematic content of financial statement disclosures (labeled as topic) is incrementally informative in predicting intentional misreporting. Using a Bayesian topic modeling algorithm, we determine and empirically quantify the topic content of a large collection of 10-K narratives spanning the 1994 to 2012 period. We find that the algorithm produces a valid set of semantically meaningful topics that are predictive of financial misreporting based on samples of SEC enforcement actions (AAERs) and irregularity restatements arising from intentional GAAP violations. Our out-of-sample tests indicate that topic significantly improves the detection of financial misreporting when added to models based on commonly-used financial and textual style variables. Furthermore, we find that models including topic outperform traditional models when predicting long-duration misstatements. These results are robust to alternative topic definitions and regression specifications and various controls for firms with repeated instances of financial misreporting.
KW - Topic
KW - Disclosure
KW - Latent Dirichlet Allocation
KW - Financial Misreporting
U2 - 10.2139/ssrn.2803733
DO - 10.2139/ssrn.2803733
M3 - Working paper
T3 - 27th Annual Conference on Financial Economics and Accounting Paper
BT - What are You Saying? Using Topic to Detect Financial Misreporting
ER -