TY - JOUR
T1 - Gbase
T2 - An efficient analysis platform for large graphs
AU - Kang, U.
AU - Tong, Hanghang
AU - Sun, Jimeng
AU - Lin, Ching Yung
AU - Faloutsos, Christos
PY - 2012/10
Y1 - 2012/10
N2 - Graphs appear in numerous applications including cyber security, the Internet, social networks, protein networks, recommendation systems, citation networks, and many more. Graphs with millions or even billions of nodes and edges are commonplace. How to store such large graphs efficiently? What are the core operations/queries on those graphs? How to answer the graph queries quickly? We propose Gbase, an efficient analysis platform for large graphs. The key novelties lie in (1) our storage and compression scheme for parallel, distributed settings and (2) the carefully chosen graph operations and their efficient implementations. We designed and implemented an instance of Gbase using MapReduce/Hadoop. Gbase provides a parallel indexing mechanism for graph operations that both saves storage space and accelerates query responses. We run numerous experiments on real and synthetic graphs, spanning billions of nodes and edges, and we show that our proposed Gbase is indeed fast, scalable, and nimble, with significant savings in space and time.
AB - Graphs appear in numerous applications including cyber security, the Internet, social networks, protein networks, recommendation systems, citation networks, and many more. Graphs with millions or even billions of nodes and edges are commonplace. How to store such large graphs efficiently? What are the core operations/queries on those graphs? How to answer the graph queries quickly? We propose Gbase, an efficient analysis platform for large graphs. The key novelties lie in (1) our storage and compression scheme for parallel, distributed settings and (2) the carefully chosen graph operations and their efficient implementations. We designed and implemented an instance of Gbase using MapReduce/Hadoop. Gbase provides a parallel indexing mechanism for graph operations that both saves storage space and accelerates query responses. We run numerous experiments on real and synthetic graphs, spanning billions of nodes and edges, and we show that our proposed Gbase is indeed fast, scalable, and nimble, with significant savings in space and time.
KW - Compression
KW - Distributed computing
KW - Graph
KW - Indexing
UR - http://www.scopus.com/inward/record.url?scp=84866450568&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84866450568&partnerID=8YFLogxK
U2 - 10.1007/s00778-012-0283-9
DO - 10.1007/s00778-012-0283-9
M3 - Article
AN - SCOPUS:84866450568
SN - 1066-8888
VL - 21
SP - 637
EP - 650
JO - VLDB Journal
JF - VLDB Journal
IS - 5
ER -