TY - GEN
T1 - Social Trove
T2 - 12th IEEE International Conference on Autonomic Computing, ICAC 2015
AU - Al Amin, Md Tanvir
AU - Li, Shen
AU - Rahman, Muntasir Raihan
AU - Seetharamu, Panindra Tumkur
AU - Wang, Shiguang
AU - Abdelzaher, Tarek
AU - Gupta, Indranil
AU - Srivatsa, Mudhakar
AU - Ganti, Raghu
AU - Ahmed, Reaz
AU - Le, Hieu
N1 - Publisher Copyright: © 2015 IEEE.
PY - 2015/9/14
Y1 - 2015/9/14
N2 - The increasing availability of smartphones, cameras, and wearables with instant data sharing capabilities, and the exploitation of social networks for information broadcast, herald a future of real-time information overload. With the growing excess of worldwide streaming data, such as images, geotags, text annotations, and sensory measurements, data summarization will become an increasingly common service. The objective of such a service will be to obtain a representative sampling of large data streams at a configurable granularity, in real time, for subsequent consumption by a range of data-centric applications. This paper describes a general-purpose self-summarizing storage service, called Social Trove, for social sensing applications. The service summarizes data streams from human sources, or sensors in their possession, by hierarchically clustering received information in accordance with an application-specific distance metric. It then serves a sampling of produced clusters at a configurable granularity in response to application queries. While Social Trove is a general service, we illustrate its functionality and evaluate it in the specific context of workloads collected from Twitter. Results show that Social Trove supports a high query throughput, while maintaining a low access latency to the produced real-time application-specific data summaries. As a specific application case study, we implement a fact-finding service on top of Social Trove.
KW - Clustering
KW - Social Sensing
KW - Storage
KW - Summarization
UR - http://www.scopus.com/inward/record.url?scp=84961768118&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84961768118&partnerID=8YFLogxK
U2 - 10.1109/ICAC.2015.47
DO - 10.1109/ICAC.2015.47
M3 - Conference contribution
AN - SCOPUS:84961768118
T3 - Proceedings - IEEE International Conference on Autonomic Computing, ICAC 2015
SP - 41
EP - 50
BT - Proceedings - IEEE International Conference on Autonomic Computing, ICAC 2015
A2 - Lalanda, Philippe
A2 - Kounev, Samuel
A2 - Diaconescu, Ada
A2 - Cherkasova, Lucy
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 7 July 2015 through 10 July 2015
ER -