Four best practices for measuring news sentiment using ‘off-the-shelf’ dictionaries: a large-scale p-hacking experiment

Chung Hong Chan, Joseph Bajjalieh, Loretta Auvil, Hartmut Wessler, Scott Althaus, Kasper Welbers, Wouter Van Atteveldt, Marc Jungblut

Research output: Contribution to journal › Article › peer-review

Abstract

We examined the validity of 37 sentiment scores produced by dictionary-based methods using a large news corpus, and demonstrated the risk of generating a spectrum of results with different levels of statistical significance by presenting an analysis of relationships between news sentiment and U.S. presidential approval. We summarize our findings into four best practices: 1) use a suitable sentiment dictionary; 2) do not assume that the validity and reliability of the dictionary are ‘built-in’; 3) check for the influence of content length; and 4) do not use multiple dictionaries to test the same statistical hypothesis.
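The dictionary-based scoring the abstract refers to can be illustrated with a minimal sketch. The toy word lists and the length normalization below are illustrative assumptions only; they are not any of the 37 dictionaries evaluated in the article.

```python
# Minimal sketch of dictionary-based sentiment scoring (illustrative only;
# the toy word lists below are NOT any of the 37 dictionaries the paper tests).
POSITIVE = {"good", "strong", "success"}
NEGATIVE = {"bad", "weak", "failure"}

def sentiment(text: str) -> float:
    """Net sentiment normalized by document length.

    Normalizing by token count addresses best practice 3: raw dictionary
    hit counts tend to grow with article length, so longer articles would
    otherwise look more 'sentimental' regardless of tone.
    """
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)

print(sentiment("a strong success after a bad start"))  # (2 - 1) / 7 ≈ 0.1429
```

A real analysis would swap in a validated off-the-shelf dictionary and a proper tokenizer; the point here is only the shape of the computation that best practices 1–3 scrutinize.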

Original language: English (US)
Pages (from-to): 1-27
Number of pages: 27
Journal: Computational Communication Research
Volume: 3
Issue number: 1
State: Published - Mar 2021

Keywords

  • Agenda setting
  • News sentiment
  • P-hacking
  • Sentiment analysis
  • Text-as-data
  • Validity

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Linguistics and Language

