TY - GEN
T1 - Fine Grained Categorization of Drug Usage Tweets
AU - Dey, Priyanka
AU - Zhai, Cheng Xiang
N1 - Acknowledgments. This material is based upon work supported by the National Science Foundation under Grant No. 1801652 and by the National Institutes of Health under Grant 1 R56 AI114501-01A1.
PY - 2022
Y1 - 2022
N2 - Drug misuse and overdose have plagued the United States over the past decades and have severely impacted several communities and families. Often, it is difficult for drug users to get the assistance they need, and thus many usage cases remain undetected until it is too late. With the rise of social media, many users prefer to discuss their emotions in virtual environments where they can also meet others dealing with similar problems. The widespread use of social media sites creates new opportunities to apply NLP techniques to analyze content and potentially help drug users (e.g., through early detection and intervention). To tap into such opportunities, we study the categorization of tweets about drug usage into fine-grained categories. To facilitate the study of this new problem, we create a new dataset and use it to study the effectiveness of multiple representative categorization methods. We further analyze errors made by these methods and explore new features to improve them. We find that a new feature based on tweet tone is quite useful in improving classification scores. We further explore possible downstream applications based on this classification system and provide a set of preliminary findings.
KW - Categorization
KW - Drug usage
KW - Public health
KW - Social media analytics
UR - http://www.scopus.com/inward/record.url?scp=85133032569&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85133032569&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-05061-9_19
DO - 10.1007/978-3-031-05061-9_19
M3 - Conference contribution
AN - SCOPUS:85133032569
SN - 9783031050602
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 267
EP - 280
BT - Social Computing and Social Media
A2 - Meiselwitz, Gabriele
PB - Springer
T2 - 14th International Conference on Social Computing and Social Media, SCSM 2022 Held as Part of the 24th HCI International Conference, HCII 2022
Y2 - 26 June 2022 through 1 July 2022
ER -