Detecting the correlation between sentiment and user-level as well as text-level meta-data from benchmark corpora

Shubhanshu Mishra, Jana Diesner

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Do tweets from users with similar Twitter characteristics have similar sentiments? What meta-data features of tweets and users correlate with tweet sentiment? In this paper, we address these two questions by analyzing six popular benchmark datasets in which tweets are annotated with sentiment labels. We consider user-level as well as tweet-level meta-data features, and identify patterns and correlations of these features with the log-odds for sentiment classes. We further strengthen our analysis by replicating this set of experiments on recent tweets from users present in our datasets, finding that most of the patterns are consistent across our analyses. Finally, we use the identified meta-data as features for a sentiment classification algorithm, which results in an increase of around 2% in F1 score for sentiment classification compared to text-only classifiers, along with a significant drop in KL-divergence. These results have the potential to improve sentiment analysis applications on social media data.
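The abstract refers to two quantities: the log-odds of a sentiment class given a meta-data feature, and the KL-divergence between label distributions. The following is a minimal sketch of how such quantities could be computed, not the authors' implementation. The function names, the `has_url` feature, the additive smoothing constant, and the toy rows are all assumptions made for illustration.

```python
import math
from collections import Counter


def log_odds(labels, target, smoothing=0.5):
    """Log-odds of `target` versus all other labels.

    `smoothing` is an assumed additive constant that avoids log(0).
    """
    pos = sum(1 for y in labels if y == target)
    neg = len(labels) - pos
    return math.log((pos + smoothing) / (neg + smoothing))


def log_odds_by_feature(rows, feature, target):
    """Group rows by a meta-data feature value and compute log-odds per group."""
    groups = {}
    for row in rows:
        groups.setdefault(row[feature], []).append(row["sentiment"])
    return {value: log_odds(labels, target) for value, labels in groups.items()}


def label_distribution(labels, classes):
    """Empirical probability of each class in `labels`."""
    counts = Counter(labels)
    total = len(labels)
    return {c: counts[c] / total for c in classes}


def kl_divergence(p, q):
    """KL(p || q) in nats; q must be nonzero wherever p is nonzero."""
    return sum(p[k] * math.log(p[k] / q[k]) for k in p if p[k] > 0)


# Hypothetical toy data: one tweet-level feature and a sentiment label.
rows = [
    {"has_url": True, "sentiment": "neutral"},
    {"has_url": True, "sentiment": "neutral"},
    {"has_url": True, "sentiment": "positive"},
    {"has_url": False, "sentiment": "positive"},
    {"has_url": False, "sentiment": "negative"},
    {"has_url": False, "sentiment": "positive"},
]

per_group = log_odds_by_feature(rows, "has_url", "positive")
for value, score in per_group.items():
    print(f"has_url={value}: log-odds(positive) = {score:.3f}")
```

In this toy data, a positive log-odds value for a group means the positive class is more common than the other classes combined within that group. The KL-divergence helper can be used to compare a predicted label distribution against the annotated one, where a lower value indicates a closer match.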

Original language: English (US)
Title of host publication: HT 2018 - Proceedings of the 29th ACM Conference on Hypertext and Social Media
Publisher: Association for Computing Machinery, Inc
Pages: 2-10
Number of pages: 9
ISBN (Electronic): 9781450354271
State: Published - Jul 3 2018
Event: 29th ACM International Conference on Hypertext and Social Media, HT 2018 - Baltimore, United States
Duration: Jul 9 2018 to Jul 12 2018

Publication series

Name: HT 2018 - Proceedings of the 29th ACM Conference on Hypertext and Social Media

Other

Other: 29th ACM International Conference on Hypertext and Social Media, HT 2018
Country/Territory: United States
City: Baltimore
Period: 7/9/18 to 7/12/18

Keywords

  • Sentiment analysis
  • Social media data
  • Social media meta-data
  • Statistical analysis

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence
  • Human-Computer Interaction
  • Computer Graphics and Computer-Aided Design
