TY - JOUR
T1 - Neural methods for point-wise dependency estimation
AU - Tsai, Yao Hung Hubert
AU - Zhao, Han
AU - Yamada, Makoto
AU - Morency, Louis-Philippe
AU - Salakhutdinov, Ruslan
N1 - Funding Information:
This work was supported in part by DARPA grants FA875018C0150 and HR00111990016, NSF IIS1763562, NSF Awards #1750439 and #1722822, the National Institutes of Health, and Apple. We would also like to acknowledge NVIDIA’s GPU support.
Publisher Copyright:
© 2020 Neural Information Processing Systems Foundation. All rights reserved.
PY - 2020
Y1 - 2020
N2 - Since its inception, the neural estimation of mutual information (MI) has demonstrated empirical success in modeling the expected dependency between high-dimensional random variables. However, MI is an aggregate statistic and cannot be used to measure point-wise dependency between different events. In this work, instead of estimating the expected dependency, we focus on estimating point-wise dependency (PD), which quantitatively measures how likely two outcomes co-occur. We show that we can naturally obtain PD when optimizing MI neural variational bounds. However, optimizing these bounds is challenging in practice due to their large variance. To address this issue, we develop two methods that are free of optimizing MI variational bounds: Probabilistic Classifier and Density-Ratio Fitting. We demonstrate the effectiveness of our approaches in 1) MI estimation, 2) self-supervised representation learning, and 3) a cross-modal retrieval task.
AB - Since its inception, the neural estimation of mutual information (MI) has demonstrated empirical success in modeling the expected dependency between high-dimensional random variables. However, MI is an aggregate statistic and cannot be used to measure point-wise dependency between different events. In this work, instead of estimating the expected dependency, we focus on estimating point-wise dependency (PD), which quantitatively measures how likely two outcomes co-occur. We show that we can naturally obtain PD when optimizing MI neural variational bounds. However, optimizing these bounds is challenging in practice due to their large variance. To address this issue, we develop two methods that are free of optimizing MI variational bounds: Probabilistic Classifier and Density-Ratio Fitting. We demonstrate the effectiveness of our approaches in 1) MI estimation, 2) self-supervised representation learning, and 3) a cross-modal retrieval task.
UR - http://www.scopus.com/inward/record.url?scp=85101667046&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85101667046&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85101667046
SN - 1049-5258
VL - 2020-December
JO - Advances in Neural Information Processing Systems
JF - Advances in Neural Information Processing Systems
T2 - 34th Conference on Neural Information Processing Systems, NeurIPS 2020
Y2 - 6 December 2020 through 12 December 2020
ER -