Deterministic coupon collection and better strong dispersers

Raghu Meka, Omer Reingold, Yuan Zhou

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Hashing is one of the main techniques in data processing and algorithm design for very large data sets. While random hash functions satisfy most desirable properties, it is often too expensive to store a fully random hash function. Motivated by this, much attention has been given to designing small families of hash functions suitable for various applications. In this work, we study the question of designing space-efficient hash families H = {h : [U] → [N]} with the natural property of covering: H is said to be covering if any set of Ω(N logN) distinct items from the universe (the coupon-collector limit) are hashed to cover all N bins by most hash functions in H. We give an explicit family H of size poly(N) (which is optimal), so that hash functions in H can be specified efficiently by O(logN) bits. We build covering hash functions by drawing a connection to dispersers, which are quite well studied and have a variety of applications themselves. We in fact need strong dispersers and we give new constructions of strong dispersers which may be of independent interest. Specifically, we construct strong dispersers with optimal entropy loss in the high min-entropy, but very small error (poly(n)/2n for n bit sources) regimes. We also provide a strong disperser construction with constant error but for any min-entropy. Our constructions achieve these by using part of the source to replace seed from previous non-strong constructions in surprising ways. In doing so, we take two of the few constructions of dispersers with parameters better than known extractors and make them strong.

Original languageEnglish (US)
Title of host publicationLeibniz International Proceedings in Informatics, LIPIcs
EditorsKlaus Jansen, Cristopher Moore, Nikhil R. Devanur, Jose D. P. Rolim
PublisherSchloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing
Pages872-884
Number of pages13
ISBN (Electronic)9783939897743
DOIs
StatePublished - Sep 1 2014
Externally publishedYes
Event17th International Workshop on Approximation Algorithms for Combinatorial Optimization Problems, APPROX 2014 and the 18th International Workshop on Randomization and Computation, RANDOM 2014 - Barcelona, Spain
Duration: Sep 4 2014Sep 6 2014

Publication series

NameLeibniz International Proceedings in Informatics, LIPIcs
Volume28
ISSN (Print)1868-8969

Conference

Conference17th International Workshop on Approximation Algorithms for Combinatorial Optimization Problems, APPROX 2014 and the 18th International Workshop on Randomization and Computation, RANDOM 2014
CountrySpain
CityBarcelona
Period9/4/149/6/14

Fingerprint

Hash functions
Entropy
Bins
Seed

Keywords

  • Coupon collection
  • Dispersers
  • Hashing
  • Pseudorandomness
  • Strong dispersers

ASJC Scopus subject areas

  • Software

Cite this

Meka, R., Reingold, O., & Zhou, Y. (2014). Deterministic coupon collection and better strong dispersers. In K. Jansen, C. Moore, N. R. Devanur, & J. D. P. Rolim (Eds.), Leibniz International Proceedings in Informatics, LIPIcs (pp. 872-884). (Leibniz International Proceedings in Informatics, LIPIcs; Vol. 28). Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing. https://doi.org/10.4230/LIPIcs.APPROX-RANDOM.2014.872

Deterministic coupon collection and better strong dispersers. / Meka, Raghu; Reingold, Omer; Zhou, Yuan.

Leibniz International Proceedings in Informatics, LIPIcs. ed. / Klaus Jansen; Cristopher Moore; Nikhil R. Devanur; Jose D. P. Rolim. Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing, 2014. p. 872-884 (Leibniz International Proceedings in Informatics, LIPIcs; Vol. 28).

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Meka, R, Reingold, O & Zhou, Y 2014, Deterministic coupon collection and better strong dispersers. in K Jansen, C Moore, NR Devanur & JDP Rolim (eds), Leibniz International Proceedings in Informatics, LIPIcs. Leibniz International Proceedings in Informatics, LIPIcs, vol. 28, Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing, pp. 872-884, 17th International Workshop on Approximation Algorithms for Combinatorial Optimization Problems, APPROX 2014 and the 18th International Workshop on Randomization and Computation, RANDOM 2014, Barcelona, Spain, 9/4/14. https://doi.org/10.4230/LIPIcs.APPROX-RANDOM.2014.872
Meka R, Reingold O, Zhou Y. Deterministic coupon collection and better strong dispersers. In Jansen K, Moore C, Devanur NR, Rolim JDP, editors, Leibniz International Proceedings in Informatics, LIPIcs. Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing. 2014. p. 872-884. (Leibniz International Proceedings in Informatics, LIPIcs). https://doi.org/10.4230/LIPIcs.APPROX-RANDOM.2014.872
Meka, Raghu ; Reingold, Omer ; Zhou, Yuan. / Deterministic coupon collection and better strong dispersers. Leibniz International Proceedings in Informatics, LIPIcs. editor / Klaus Jansen ; Cristopher Moore ; Nikhil R. Devanur ; Jose D. P. Rolim. Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing, 2014. pp. 872-884 (Leibniz International Proceedings in Informatics, LIPIcs).
@inproceedings{624b0b0f8dc24c9fbde45016995fdddb,
title = "Deterministic coupon collection and better strong dispersers",
abstract = "Hashing is one of the main techniques in data processing and algorithm design for very large data sets. While random hash functions satisfy most desirable properties, it is often too expensive to store a fully random hash function. Motivated by this, much attention has been given to designing small families of hash functions suitable for various applications. In this work, we study the question of designing space-efficient hash families H = {h : [U] → [N]} with the natural property of covering: H is said to be covering if any set of Ω(N logN) distinct items from the universe (the coupon-collector limit) are hashed to cover all N bins by most hash functions in H. We give an explicit family H of size poly(N) (which is optimal), so that hash functions in H can be specified efficiently by O(logN) bits. We build covering hash functions by drawing a connection to dispersers, which are quite well studied and have a variety of applications themselves. We in fact need strong dispersers and we give new constructions of strong dispersers which may be of independent interest. Specifically, we construct strong dispersers with optimal entropy loss in the high min-entropy, but very small error (poly(n)/2n for n bit sources) regimes. We also provide a strong disperser construction with constant error but for any min-entropy. Our constructions achieve these by using part of the source to replace seed from previous non-strong constructions in surprising ways. In doing so, we take two of the few constructions of dispersers with parameters better than known extractors and make them strong.",
keywords = "Coupon collection, Dispersers, Hashing, Pseudorandomness, Strong dispersers",
author = "Raghu Meka and Omer Reingold and Yuan Zhou",
year = "2014",
month = "9",
day = "1",
doi = "10.4230/LIPIcs.APPROX-RANDOM.2014.872",
language = "English (US)",
series = "Leibniz International Proceedings in Informatics, LIPIcs",
publisher = "Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing",
pages = "872--884",
editor = "Klaus Jansen and Cristopher Moore and Devanur, {Nikhil R.} and Rolim, {Jose D. P.}",
booktitle = "Leibniz International Proceedings in Informatics, LIPIcs",
address = "Germany",

}

TY - GEN

T1 - Deterministic coupon collection and better strong dispersers

AU - Meka, Raghu

AU - Reingold, Omer

AU - Zhou, Yuan

PY - 2014/9/1

Y1 - 2014/9/1

N2 - Hashing is one of the main techniques in data processing and algorithm design for very large data sets. While random hash functions satisfy most desirable properties, it is often too expensive to store a fully random hash function. Motivated by this, much attention has been given to designing small families of hash functions suitable for various applications. In this work, we study the question of designing space-efficient hash families H = {h : [U] → [N]} with the natural property of covering: H is said to be covering if any set of Ω(N logN) distinct items from the universe (the coupon-collector limit) are hashed to cover all N bins by most hash functions in H. We give an explicit family H of size poly(N) (which is optimal), so that hash functions in H can be specified efficiently by O(logN) bits. We build covering hash functions by drawing a connection to dispersers, which are quite well studied and have a variety of applications themselves. We in fact need strong dispersers and we give new constructions of strong dispersers which may be of independent interest. Specifically, we construct strong dispersers with optimal entropy loss in the high min-entropy, but very small error (poly(n)/2n for n bit sources) regimes. We also provide a strong disperser construction with constant error but for any min-entropy. Our constructions achieve these by using part of the source to replace seed from previous non-strong constructions in surprising ways. In doing so, we take two of the few constructions of dispersers with parameters better than known extractors and make them strong.

AB - Hashing is one of the main techniques in data processing and algorithm design for very large data sets. While random hash functions satisfy most desirable properties, it is often too expensive to store a fully random hash function. Motivated by this, much attention has been given to designing small families of hash functions suitable for various applications. In this work, we study the question of designing space-efficient hash families H = {h : [U] → [N]} with the natural property of covering: H is said to be covering if any set of Ω(N logN) distinct items from the universe (the coupon-collector limit) are hashed to cover all N bins by most hash functions in H. We give an explicit family H of size poly(N) (which is optimal), so that hash functions in H can be specified efficiently by O(logN) bits. We build covering hash functions by drawing a connection to dispersers, which are quite well studied and have a variety of applications themselves. We in fact need strong dispersers and we give new constructions of strong dispersers which may be of independent interest. Specifically, we construct strong dispersers with optimal entropy loss in the high min-entropy, but very small error (poly(n)/2n for n bit sources) regimes. We also provide a strong disperser construction with constant error but for any min-entropy. Our constructions achieve these by using part of the source to replace seed from previous non-strong constructions in surprising ways. In doing so, we take two of the few constructions of dispersers with parameters better than known extractors and make them strong.

KW - Coupon collection

KW - Dispersers

KW - Hashing

KW - Pseudorandomness

KW - Strong dispersers

UR - http://www.scopus.com/inward/record.url?scp=84920123895&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84920123895&partnerID=8YFLogxK

U2 - 10.4230/LIPIcs.APPROX-RANDOM.2014.872

DO - 10.4230/LIPIcs.APPROX-RANDOM.2014.872

M3 - Conference contribution

AN - SCOPUS:84920123895

T3 - Leibniz International Proceedings in Informatics, LIPIcs

SP - 872

EP - 884

BT - Leibniz International Proceedings in Informatics, LIPIcs

A2 - Jansen, Klaus

A2 - Moore, Cristopher

A2 - Devanur, Nikhil R.

A2 - Rolim, Jose D. P.

PB - Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing

ER -