Histogram sort with sampling

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

To minimize data movement, state-of-the-art parallel sorting algorithms use techniques based on sampling and histogramming to partition keys prior to redistribution. Sampling enables partitioning to be done using a representative subset of the keys, while histogramming enables evaluation and iterative improvement of a given partition. We introduce Histogram sort with sampling (HSS), which combines sampling and iterative histogramming to find high-quality partitions with minimal data movement and high practical performance. Compared to the best known (recently introduced) algorithm for finding these partitions, our algorithm requires a factor of Θ(log(p)/ log log(p)) less communication, and substantially less when compared to standard variants of Sample sort and Histogram sort. We provide a distributed-memory implementation of the proposed algorithm, compare its performance to two existing implementations, and provide a brief application study showing benefit of the new algorithm.

Original languageEnglish (US)
Title of host publicationSPAA 2019 - Proceedings of the 31st ACM Symposium on Parallelism in Algorithms and Architectures
PublisherAssociation for Computing Machinery
Pages201-212
Number of pages12
ISBN (Electronic)9781450361842
DOIs
StatePublished - Jun 17 2019
Event31st ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 2019 - Phoenix, United States
Duration: Jun 22 2019Jun 24 2019

Publication series

NameAnnual ACM Symposium on Parallelism in Algorithms and Architectures

Conference

Conference31st ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 2019
Country/TerritoryUnited States
CityPhoenix
Period6/22/196/24/19

Keywords

  • Data partitioning
  • Histogramming
  • Parallel sorting
  • Sampling

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Hardware and Architecture

Fingerprint

Dive into the research topics of 'Histogram sort with sampling'. Together they form a unique fingerprint.

Cite this