Abstract
Consider the problem of estimating the Shannon entropy of a distribution over $k$ elements from $n$ independent samples. We show that the minimax mean-square error is within universal multiplicative constant factors of $\left(\frac{k}{n\log k}\right)^2 + \frac{\log^2 k}{n}$ if $n$ exceeds a constant factor of $k/\log k$; otherwise, there exists no consistent estimator. This refines the recent result of Valiant and Valiant that the minimal sample size for consistent entropy estimation scales according to $\Theta(k/\log k)$. The apparatus of best polynomial approximation plays a key role in both the construction of optimal estimators and, via a duality argument, the minimax lower bound.
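As a rough illustration of the estimation problem (not the paper's construction), the sketch below contrasts a bias-corrected plug-in estimate for well-observed symbols with an unbiased estimate of a near-best polynomial approximation of $-p\log p$ for rarely observed ones. The tuning constants `c0`, `c1`, `c2`, the Chebyshev-interpolation step, and the name `entropy_poly_approx` are illustrative assumptions, not details taken from the paper.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def entropy_poly_approx(counts, n, k, c0=0.5, c1=1.0, c2=1.0):
    """Hedged sketch of a polynomial-approximation entropy estimate (in nats).

    counts: length-k vector of symbol counts summing to n (zeros included).
    c0, c1, c2 are illustrative tuning constants, not the paper's choices.
    """
    counts = np.asarray(counts, dtype=float)
    log_k = np.log(k)
    thresh = c0 * log_k            # counts above this are "well observed"
    delta = c1 * log_k / n         # approximation interval [0, delta]
    L = max(1, int(c2 * log_k))    # polynomial degree, Theta(log k)

    # Interpolate phi(p) = -p log p at Chebyshev nodes mapped into (0, delta);
    # this yields a near-best uniform polynomial approximation on [0, delta]
    # (a stand-in for the exact best approximation used in the paper).
    nodes = np.cos((np.arange(L + 1) + 0.5) * np.pi / (L + 1))
    x = 0.5 * delta * (nodes + 1.0)           # strictly inside (0, delta)
    a = P.polyfit(x, -x * np.log(x), L)       # power-basis coeffs a_0..a_L

    H = 0.0
    for N in counts:
        if N > thresh:
            # Bias-corrected plug-in term for a well-observed symbol.
            p_hat = N / n
            H += -p_hat * np.log(p_hat) + 1.0 / (2.0 * n)
        else:
            # Unbiased estimate of sum_j a_j p^j, using the fact that
            # E[(N)_j] / (n)_j = p^j for N ~ Binomial(n, p), where
            # (m)_j = m (m-1) ... (m-j+1) is the falling factorial.
            term, ff = a[0], 1.0
            for j in range(1, len(a)):
                ff *= (N - (j - 1)) / (n - (j - 1))
                term += a[j] * ff
            H += term
    return H

# Illustrative use: uniform distribution over k symbols (true entropy = log k).
rng = np.random.default_rng(0)
k, n = 2000, 1000
counts = np.bincount(rng.integers(k, size=n), minlength=k)
print(entropy_poly_approx(counts, n, k), np.log(k))
```

In practice one would also clip the output to $[0, \log k]$, since the unbiased polynomial terms for rare symbols can have high variance; the paper's actual estimator uses the exact best polynomial approximation and carefully chosen constants to attain the minimax rate.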
| Original language | English (US) |
| --- | --- |
| Article number | 7444171 |
| Pages (from-to) | 3702-3720 |
| Number of pages | 19 |
| Journal | IEEE Transactions on Information Theory |
| Volume | 62 |
| Issue number | 6 |
| DOIs | |
| State | Published - Jun 2016 |
Keywords
- Entropy estimation
- best polynomial approximation
- functional estimation
- high-dimensional statistics
- large alphabet
ASJC Scopus subject areas
- Information Systems
- Computer Science Applications
- Library and Information Sciences