Estimating mutual information for discrete-continuous mixtures

Weihao Gao, Sreeram Kannan, Sewoong Oh, Pramod Viswanath

Research output: Contribution to journal › Conference article

Abstract

Estimation of mutual information from observed samples is a basic primitive in machine learning, useful in several learning tasks including correlation mining, information bottleneck, Chow-Liu tree, and conditional independence testing in (causal) graphical models. While mutual information is a quantity well-defined for general probability spaces, estimators have been developed only in the special case of discrete or continuous pairs of random variables. Most of these estimators operate using the 3H-principle, i.e., by calculating the three (differential) entropies of X, Y and the pair (X, Y). However, in general mixture spaces, such individual entropies are not well defined, even though mutual information is. In this paper, we develop a novel estimator for estimating mutual information in discrete-continuous mixtures. We prove the consistency of this estimator theoretically as well as demonstrate its excellent empirical performance. This problem is relevant in a wide array of applications, where some variables are discrete, some continuous, and others are a mixture between continuous and discrete components.
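For readers unfamiliar with the 3H-principle mentioned in the abstract, the following is a minimal sketch of it for the purely discrete case, I(X;Y) = H(X) + H(Y) - H(X,Y), using plug-in (empirical) entropies. It is not the estimator proposed in the paper; the function names `entropy` and `mutual_information_3h` are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(samples):
    """Plug-in (empirical) Shannon entropy of discrete samples, in nats."""
    n = len(samples)
    counts = Counter(samples)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def mutual_information_3h(xs, ys):
    """3H-principle for discrete samples: H(X) + H(Y) - H(X, Y).

    Illustrative baseline only; assumes every value is hashable and discrete.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    joint = list(zip(xs, ys))  # paired samples of (X, Y)
    return entropy(xs) + entropy(ys) - entropy(joint)


# Y is a copy of X: the estimate equals H(X) = log 2.
mi_dependent = mutual_information_3h([0, 0, 1, 1], [0, 0, 1, 1])

# X and Y are empirically independent: the estimate is 0.
mi_independent = mutual_information_3h([0, 0, 1, 1], [0, 1, 0, 1])
```

This counting approach only makes sense when each entropy term is well defined. Once a variable mixes point masses with a continuous component, neither the Shannon entropy nor the differential entropy applies on its own, which is the gap the abstract describes.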

Original language: English (US)
Pages (from-to): 5987-5998
Number of pages: 12
Journal: Advances in Neural Information Processing Systems
Volume: 2017-December
State: Published - Jan 1 2017
Event: 31st Annual Conference on Neural Information Processing Systems, NIPS 2017 - Long Beach, United States
Duration: Dec 4 2017 to Dec 9 2017

Fingerprint

Entropy
Random variables
Learning systems
Testing

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing

Cite this

Estimating mutual information for discrete-continuous mixtures. / Gao, Weihao; Kannan, Sreeram; Oh, Sewoong; Viswanath, Pramod.

In: Advances in Neural Information Processing Systems, Vol. 2017-December, 01.01.2017, p. 5987-5998.


Gao, Weihao ; Kannan, Sreeram ; Oh, Sewoong ; Viswanath, Pramod. / Estimating mutual information for discrete-continuous mixtures. In: Advances in Neural Information Processing Systems. 2017 ; Vol. 2017-December. pp. 5987-5998.
@article{7ba003fc801041a1b90d9e8e3629c363,
title = "Estimating mutual information for discrete-continuous mixtures",
author = "Weihao Gao and Sreeram Kannan and Sewoong Oh and Pramod Viswanath",
year = "2017",
month = "1",
day = "1",
language = "English (US)",
volume = "2017-December",
pages = "5987--5998",
journal = "Advances in Neural Information Processing Systems",
issn = "1049-5258",

}

TY - JOUR

T1 - Estimating mutual information for discrete-continuous mixtures

AU - Gao, Weihao

AU - Kannan, Sreeram

AU - Oh, Sewoong

AU - Viswanath, Pramod

PY - 2017/1/1

Y1 - 2017/1/1

UR - http://www.scopus.com/inward/record.url?scp=85042380801&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85042380801&partnerID=8YFLogxK

M3 - Conference article

AN - SCOPUS:85042380801

VL - 2017-December

SP - 5987

EP - 5998

JO - Advances in Neural Information Processing Systems

JF - Advances in Neural Information Processing Systems

SN - 1049-5258

ER -