An online risk index for the cross-sectional prediction of new HIV, chlamydia, and gonorrhea diagnoses across U.S. counties and across years

Man Pui Sally Chan, Sophie Lohmann, Alex Morales, Chengxiang Zhai, Lyle Ungar, David R. Holtgrave, Dolores Albarracin

Research output: Contribution to journal › Article

Abstract

The present study evaluated the potential use of Twitter data for providing risk indices of STIs. We developed online risk indices (ORIs) based on tweets to predict new HIV, gonorrhea, and chlamydia diagnoses, across U.S. counties and across 5 years. We analyzed over one hundred million tweets from 2009 to 2013 using open-vocabulary techniques and estimated the ORIs for a particular year by entering tweets from the same year into multiple semantic models (one for each year). The ORIs were moderately to strongly associated with the actual rates (.35 < rs < .68 for 93% of models), both nationwide and when applied to single states (California, Florida, and New York). Later models were slightly better than older ones at predicting gonorrhea and chlamydia, but not at predicting HIV. The proposed technique using free social media data provides signals of community health at a high temporal and spatial resolution.

Original language: English (US)
Pages (from-to): 2322-2333
Number of pages: 12
Journal: AIDS and Behavior
Volume: 22
Issue number: 7
DOIs: 10.1007/s10461-018-2046-0
State: Published - Jul 1 2018

Fingerprint

Chlamydia
Gonorrhea
HIV
Social Media
Vocabulary
Sexually Transmitted Diseases
Semantics
Health

Keywords

  • Big data
  • Chlamydia
  • Gonorrhea
  • HIV
  • Social media

ASJC Scopus subject areas

  • Social Psychology
  • Public Health, Environmental and Occupational Health
  • Infectious Diseases

Cite this

An online risk index for the cross-sectional prediction of new HIV, chlamydia, and gonorrhea diagnoses across U.S. counties and across years. / Chan, Man Pui Sally; Lohmann, Sophie; Morales, Alex; Zhai, Chengxiang; Ungar, Lyle; Holtgrave, David R.; Albarracin, Dolores.

In: AIDS and Behavior, Vol. 22, No. 7, 01.07.2018, p. 2322-2333.

Chan, Man Pui Sally; Lohmann, Sophie; Morales, Alex; Zhai, Chengxiang; Ungar, Lyle; Holtgrave, David R.; Albarracin, Dolores. / An online risk index for the cross-sectional prediction of new HIV, chlamydia, and gonorrhea diagnoses across U.S. counties and across years. In: AIDS and Behavior. 2018; Vol. 22, No. 7. pp. 2322-2333.
@article{78a4ea391f6340db9495f729e2835aa7,
title = "An online risk index for the cross-sectional prediction of new HIV, chlamydia, and gonorrhea diagnoses across U.S. counties and across years",
abstract = "The present study evaluated the potential use of Twitter data for providing risk indices of STIs. We developed online risk indices (ORIs) based on tweets to predict new HIV, gonorrhea, and chlamydia diagnoses, across U.S. counties and across 5 years. We analyzed over one hundred million tweets from 2009 to 2013 using open-vocabulary techniques and estimated the ORIs for a particular year by entering tweets from the same year into multiple semantic models (one for each year). The ORIs were moderately to strongly associated with the actual rates (.35 < rs < .68 for 93{\%} of models), both nationwide and when applied to single states (California, Florida, and New York). Later models were slightly better than older ones at predicting gonorrhea and chlamydia, but not at predicting HIV. The proposed technique using free social media data provides signals of community health at a high temporal and spatial resolution.",
keywords = "Big data, Chlamydia, Gonorrhea, HIV, Social media",
author = "Chan, {Man Pui Sally} and Sophie Lohmann and Alex Morales and Chengxiang Zhai and Lyle Ungar and Holtgrave, {David R.} and Dolores Albarracin",
year = "2018",
month = "7",
day = "1",
doi = "10.1007/s10461-018-2046-0",
language = "English (US)",
volume = "22",
pages = "2322--2333",
journal = "AIDS and Behavior",
issn = "1090-7165",
publisher = "Springer New York",
number = "7",

}

TY - JOUR

T1 - An online risk index for the cross-sectional prediction of new HIV, chlamydia, and gonorrhea diagnoses across U.S. counties and across years

AU - Chan, Man Pui Sally

AU - Lohmann, Sophie

AU - Morales, Alex

AU - Zhai, Chengxiang

AU - Ungar, Lyle

AU - Holtgrave, David R.

AU - Albarracin, Dolores

PY - 2018/7/1

Y1 - 2018/7/1

N2 - The present study evaluated the potential use of Twitter data for providing risk indices of STIs. We developed online risk indices (ORIs) based on tweets to predict new HIV, gonorrhea, and chlamydia diagnoses, across U.S. counties and across 5 years. We analyzed over one hundred million tweets from 2009 to 2013 using open-vocabulary techniques and estimated the ORIs for a particular year by entering tweets from the same year into multiple semantic models (one for each year). The ORIs were moderately to strongly associated with the actual rates (.35 < rs < .68 for 93% of models), both nationwide and when applied to single states (California, Florida, and New York). Later models were slightly better than older ones at predicting gonorrhea and chlamydia, but not at predicting HIV. The proposed technique using free social media data provides signals of community health at a high temporal and spatial resolution.

AB - The present study evaluated the potential use of Twitter data for providing risk indices of STIs. We developed online risk indices (ORIs) based on tweets to predict new HIV, gonorrhea, and chlamydia diagnoses, across U.S. counties and across 5 years. We analyzed over one hundred million tweets from 2009 to 2013 using open-vocabulary techniques and estimated the ORIs for a particular year by entering tweets from the same year into multiple semantic models (one for each year). The ORIs were moderately to strongly associated with the actual rates (.35 < rs < .68 for 93% of models), both nationwide and when applied to single states (California, Florida, and New York). Later models were slightly better than older ones at predicting gonorrhea and chlamydia, but not at predicting HIV. The proposed technique using free social media data provides signals of community health at a high temporal and spatial resolution.

KW - Big data

KW - Chlamydia

KW - Gonorrhea

KW - HIV

KW - Social media

UR - http://www.scopus.com/inward/record.url?scp=85041838903&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85041838903&partnerID=8YFLogxK

U2 - 10.1007/s10461-018-2046-0

DO - 10.1007/s10461-018-2046-0

M3 - Article

VL - 22

SP - 2322

EP - 2333

JO - AIDS and Behavior

JF - AIDS and Behavior

SN - 1090-7165

IS - 7

ER -