Multi-Attribute Topic Feature Construction for Social Media-based Prediction

Alex Morales, Nupoor Gandhi, Man Pui Sally Chan, Sophie Lohmann, Travis Sanchez, Kathleen A. Brady, Lyle Ungar, Dolores Albarracín, Chengxiang Zhai

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


The effectiveness of social media-based prediction depends heavily on whether effective content-based features can be constructed from social media text. Features built on topics learned by a topic model are attractive because they are semantically expressive and accommodate inexact matching of semantically related words. We develop a novel, general framework for constructing multi-attribute topic features using multiple views of the text data, defined according to metadata attributes, and study their effectiveness for a text-based prediction task. Furthermore, we propose and study multiple weighting strategies for aligning text-based features with prediction outcomes. We evaluate the proposed method on a Twitter corpus of over 100 million tweets collected over a seven-year period (2009-2015) to predict new diagnoses of human immunodeficiency virus (HIV) and other sexually transmitted infections (STIs) in the United States at zip code-level and county-level resolution. The results show that feature representations based on attributes such as authors, locations, and hashtags are generally more effective than the conventional topic feature representation.
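As a rough illustration of the idea of attribute-defined views, the sketch below groups tweets into one pseudo-document per attribute value (author, location, hashtag) and then computes a weighted average of per-document topic vectors for each region. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify the data layout, the topic model, or the weighting schemes, so the function names `build_attribute_views` and `weighted_region_features`, the dictionary keys, and the weights are all hypothetical.

```python
from collections import defaultdict


def build_attribute_views(tweets, attributes):
    """Group tweet tokens into one pseudo-document per attribute value.

    Hypothetical layout: each tweet is a dict with a "tokens" list plus
    one key per metadata attribute. Returns {attribute: {value: [tokens]}}.
    """
    views = {attr: defaultdict(list) for attr in attributes}
    for tweet in tweets:
        for attr in attributes:
            values = tweet.get(attr, [])
            # Accept a single value (e.g. author) or a list (e.g. hashtags).
            if isinstance(values, str):
                values = [values]
            for value in values:
                views[attr][value].extend(tweet["tokens"])
    return {attr: dict(groups) for attr, groups in views.items()}


def weighted_region_features(topic_vectors, regions, weights):
    """Weighted average of per-document topic vectors for each region.

    The weights stand in for whichever weighting strategy is chosen;
    regions whose total weight is zero are skipped.
    """
    sums = {}
    totals = defaultdict(float)
    for vec, region, w in zip(topic_vectors, regions, weights):
        acc = sums.setdefault(region, [0.0] * len(vec))
        for k, p in enumerate(vec):
            acc[k] += w * p
        totals[region] += w
    return {
        r: [x / totals[r] for x in acc]
        for r, acc in sums.items()
        if totals[r] > 0
    }
```

In this sketch, the pseudo-documents from each view would be passed to a topic model of the reader's choice, and the resulting topic vectors aggregated per zip code or county before being fed to a predictor.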

Original language: English (US)
Title of host publication: Proceedings - 2018 IEEE International Conference on Big Data, Big Data 2018
Editors: Naoki Abe, Huan Liu, Calton Pu, Xiaohua Hu, Nesreen Ahmed, Mu Qiao, Yang Song, Donald Kossmann, Bing Liu, Kisung Lee, Jiliang Tang, Jingrui He, Jeffrey Saltz
Publisher: Institute of Electrical and Electronics Engineers Inc.
Number of pages: 6
ISBN (Electronic): 9781538650356
State: Published - Jul 2 2018
Event: 2018 IEEE International Conference on Big Data, Big Data 2018 - Seattle, United States
Duration: Dec 10 2018 to Dec 13 2018

Publication series

Name: Proceedings - 2018 IEEE International Conference on Big Data, Big Data 2018


Conference: 2018 IEEE International Conference on Big Data, Big Data 2018
Country/Territory: United States

ASJC Scopus subject areas

  • Computer Science Applications
  • Information Systems

