TY - GEN
T1 - Telling apart tweets associated with controversial versus noncontroversial topics
AU - Addawood, Aseel
AU - Rezapour, Rezvaneh
AU - Abdar, Omid
AU - Diesner, Jana
N1 - Publisher Copyright:
© 2017 Association for Computational Linguistics.
PY - 2017
Y1 - 2017
N2 - In this paper, we evaluate the predictability of tweets associated with controversial versus non-controversial topics. As a first step, we crowd-sourced the scoring of a predefined set of topics on a Likert scale from non-controversial to controversial. Our feature set entails and goes beyond sentiment features, e.g., by leveraging empathic language and other features that have been previously used, but are new for this particular study. We find focusing on the structural characteristics of tweets to be beneficial for this task. Using a combination of emphatic, language-specific, and Twitter-specific features for supervised learning resulted in 87% accuracy (F1) for cross-validation of the training set and 63.4% accuracy when using the test set. Our analysis shows that features specific to Twitter or social media in general are more prevalent in tweets on controversial topics than in non-controversial ones. To test the premise of the paper, we conducted two additional sets of experiments, which led to mixed results. This finding will inform our future investigations into the relationship between language use on social media and the perceived controversiality of topics.
UR - http://www.scopus.com/inward/record.url?scp=85066416470&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85066416470&partnerID=8YFLogxK
U2 - 10.18653/v1/W17-2905
DO - 10.18653/v1/W17-2905
M3 - Conference contribution
AN - SCOPUS:85066416470
T3 - Proceedings of the 2nd Workshop on Natural Language Processing and Computational Social Science, NLP+CSS 2017 at the 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017
SP - 32
EP - 41
BT - Proceedings of the 2nd Workshop on Natural Language Processing and Computational Social Science, NLP+CSS 2017 at the 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017
A2 - Hovy, Dirk
A2 - Volkova, Svitlana
A2 - Bamman, David
A2 - Jurgens, David
A2 - O'Connor, Brendan
A2 - Tsur, Oren
A2 - Dogruoz, A. Seza
PB - Association for Computational Linguistics (ACL)
T2 - 2nd Workshop on Natural Language Processing and Computational Social Science, NLP+CSS 2017 at the 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017
Y2 - 3 August 2017
ER -