Geographical topic discovery and comparison

Zhijun Yin, Liangliang Cao, Jiawei Han, Chengxiang Zhai, Thomas Huang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper studies the problem of discovering and comparing geographical topics from GPS-associated documents. GPS-associated documents have become popular with the pervasiveness of location-acquisition technologies. For example, in Flickr, geo-tagged photos are associated with tags and GPS locations. In Twitter, the locations of tweets can be identified from the GPS locations reported by smartphones. Many interesting concepts, including cultures, scenes, and product sales, correspond to specialized geographical distributions. In this paper, we are interested in two questions: (1) how can we discover different topics of interest that are coherent within geographical regions? (2) how can we compare several topics across different geographical locations? To answer these questions, this paper proposes and compares three ways of modeling geographical topics: a location-driven model, a text-driven model, and a novel joint model called LGTA (Latent Geographical Topic Analysis) that combines location and text. To make a fair comparison, we collect several representative datasets from the Flickr website, including Landscape, Activity, Manhattan, National park, Festival, Car, and Food. The results show that the first two methods work on some datasets but fail on others. LGTA works well on all of these datasets, not only finding regions of interest but also providing effective comparisons of the topics across different locations. The results confirm our hypothesis that geographical distributions can help in modeling topics, while topics provide important cues for grouping different geographical regions.

Original language: English (US)
Title of host publication: Proceedings of the 20th International Conference on World Wide Web, WWW 2011
Pages: 247-256
Number of pages: 10
State: Published - 2011
Event: 20th International Conference on World Wide Web, WWW 2011 - Hyderabad, India
Duration: Mar 28 2011 to Apr 1 2011

Keywords

  • Geographical topics
  • Topic comparison
  • Topic modeling

ASJC Scopus subject areas

  • Computer Networks and Communications
