Scaling techniques for massive scale-free graphs in distributed (external) memory

Roger Pearce, Maya Gokhale, Nancy M. Amato

Research output: Contribution to conferencePaperpeer-review

Abstract

We present techniques to process large scale-free graphs in distributed memory. Our aim is to scale to trillions of edges, and our research is targeted at leadership class supercomputers and clusters with local non-volatile memory, e.g., NAND Flash. We apply an edge list partitioning technique, designed to accommodate high-degree vertices (hubs) that create scaling challenges when processing scale-free graphs. In addition to partitioning hubs, we use ghost vertices to represent the hubs to reduce communication hotspots. We present a scaling study with three important graph algorithms: Breadth-First Search (BFS), K-Core decomposition, and Triangle Counting. We also demonstrate scalability on BG/P Intrepid by comparing to best known Graph500 results. We show results on two clusters with local NVRAM storage that are capable of traversing trillion-edge scale-free graphs. By leveraging node-local NAND Flash, our approach can process thirty-two times larger datasets with only a 39% performance degradation in Traversed Edges Per Second (TEPS).

Original languageEnglish (US)
Pages825-836
Number of pages12
DOIs
StatePublished - Oct 7 2013
Externally publishedYes
Event27th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2013 - Boston, MA, United States
Duration: May 20 2013May 24 2013

Other

Other27th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2013
CountryUnited States
CityBoston, MA
Period5/20/135/24/13

Keywords

  • big data
  • distributed computing
  • graph algorithms
  • parallel algorithms

ASJC Scopus subject areas

  • Software

Fingerprint Dive into the research topics of 'Scaling techniques for massive scale-free graphs in distributed (external) memory'. Together they form a unique fingerprint.

Cite this