STREAMCUBE: Hierarchical spatio-temporal hashtag clustering for event exploration over the Twitter stream

Wei Feng, Chao Zhang, Wei Zhang, Jiawei Han, Jianyong Wang, Charu Aggarwal, Jianbin Huang

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

What is happening around the world? When and where? Mining the geo-tagged Twitter stream makes it possible to answer the above questions in real-time. Although a single tweet can be short and noisy, proper aggregations of tweets can provide meaningful results. In this paper, we focus on hierarchical spatio-temporal hashtag clustering techniques. Our system has the following features: (1) Exploring events (hashtag clusters) with different space granularity. Users can zoom in and out on maps to find out what is happening in a particular area. (2) Exploring events with different time granularity. Users can choose to see what is happening today or in the past week. (3) Efficient single-pass algorithm for event identification, which provides human-readable hashtag clusters. (4) Efficient event ranking which aims to find burst events and localized events given a particular region and time frame. To support aggregation with different space and time granularity, we propose a data structure called STREAMCUBE, which is an extension of the data cube structure from the database community with spatial and temporal hierarchy. To achieve high scalability, we propose a divide-and-conquer method to construct the STREAMCUBE. To support flexible event ranking with different weights, we proposed a top-k based index. Different efficient methods are used to speed up event similarity computations. Finally, we have conducted extensive experiments on a real twitter data. Experimental results show that our framework can provide meaningful results with high scalability.

Original languageEnglish (US)
Title of host publication2015 IEEE 31st International Conference on Data Engineering, ICDE 2015
PublisherIEEE Computer Society
Pages1561-1572
Number of pages12
ISBN (Electronic)9781479979639
DOIs
StatePublished - May 26 2015
Event2015 31st IEEE International Conference on Data Engineering, ICDE 2015 - Seoul, Korea, Republic of
Duration: Apr 13 2015Apr 17 2015

Publication series

NameProceedings - International Conference on Data Engineering
Volume2015-May
ISSN (Print)1084-4627

Other

Other2015 31st IEEE International Conference on Data Engineering, ICDE 2015
Country/TerritoryKorea, Republic of
CitySeoul
Period4/13/154/17/15

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Information Systems

Fingerprint

Dive into the research topics of 'STREAMCUBE: Hierarchical spatio-temporal hashtag clustering for event exploration over the Twitter stream'. Together they form a unique fingerprint.

Cite this