TY - JOUR
T1 - Breaking the Bandwidth Barrier
T2 - Geometrical Adaptive Entropy Estimation
AU - Gao, Weihao
AU - Oh, Sewoong
AU - Viswanath, Pramod
N1 - Manuscript received October 9, 2016; revised August 17, 2017 and February 13, 2018; accepted February 17, 2018. Date of publication March 7, 2018; date of current version April 19, 2018. This work was supported in part by NSF SaTC under Award CNS-1527754, in part by NSF CISE under Award CCF-1553452, and in part by NSF CISE under Award CCF-1617745. This paper was presented in part at the 2016 Conference on Neural Information Processing Systems.
PY - 2018/5
Y1 - 2018/5
N2 - Estimators of information theoretic measures, such as entropy and mutual information, are a basic workhorse for many downstream applications in modern data science. State-of-the-art approaches have been either geometric [nearest neighbor (NN)-based] or kernel-based (with a globally chosen bandwidth). In this paper, we combine both these approaches to design new estimators of entropy and mutual information that outperform the state-of-the-art methods. Our estimator uses local bandwidth choices of k-NN distances with a finite k, independent of the sample size. Such a local and data-dependent choice ameliorates boundary bias and improves performance in practice, but the bandwidth is vanishing at a fast rate, leading to a non-vanishing bias. We show that the asymptotic bias of the proposed estimator is universal; it is independent of the underlying distribution. Hence, it can be precomputed and subtracted from the estimate. As a byproduct, we obtain a unified way of obtaining both the kernel and NN estimators. The corresponding theoretical contribution relating the asymptotic geometry of nearest neighbors to order statistics is of independent mathematical interest.
KW - Information theory
KW - density estimation robust algorithm
KW - information entropy
KW - mutual information
KW - nearest neighbor methods
UR - https://www.scopus.com/pages/publications/85043369477
UR - https://www.scopus.com/inward/citedby.url?scp=85043369477&partnerID=8YFLogxK
U2 - 10.1109/TIT.2018.2810313
DO - 10.1109/TIT.2018.2810313
M3 - Article
AN - SCOPUS:85043369477
SN - 0018-9448
VL - 64
SP - 3313
EP - 3330
JO - IEEE Transactions on Information Theory
JF - IEEE Transactions on Information Theory
IS - 5
ER -