TY - GEN
T1 - Detecting the correlation between sentiment and user-level as well as text-level meta-data from benchmark corpora
AU - Mishra, Shubhanshu
AU - Diesner, Jana
N1 - Publisher Copyright: © 2018 Association for Computing Machinery.
PY - 2018/7/3
Y1 - 2018/7/3
AB - Do tweets from users with similar Twitter characteristics have similar sentiments? What meta-data features of tweets and users correlate with tweet sentiment? In this paper, we address these two questions by analyzing six popular benchmark datasets where tweets are annotated with sentiment labels. We consider user-level as well as tweet-level meta-data features, and identify patterns and correlations of these features with the log-odds for sentiment classes. We further strengthen our analysis by replicating this set of experiments on recent tweets from users present in our datasets, finding that most of the patterns are consistent across our analyses. Finally, we use the identified meta-data features as inputs to a sentiment classification algorithm, which yields an increase of around 2% in F1 score over text-only classifiers, along with a significant drop in KL-divergence. These results have the potential to improve sentiment analysis applications on social media data.
KW - Sentiment analysis
KW - Social media data
KW - Social media meta-data
KW - Statistical analysis
UR - http://www.scopus.com/inward/record.url?scp=85051507909&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85051507909&partnerID=8YFLogxK
U2 - 10.1145/3209542.3209562
DO - 10.1145/3209542.3209562
M3 - Conference contribution
AN - SCOPUS:85051507909
T3 - HT 2018 - Proceedings of the 29th ACM Conference on Hypertext and Social Media
SP - 2
EP - 10
BT - HT 2018 - Proceedings of the 29th ACM Conference on Hypertext and Social Media
PB - Association for Computing Machinery, Inc
T2 - 29th ACM International Conference on Hypertext and Social Media, HT 2018
Y2 - 9 July 2018 through 12 July 2018
ER -