TY - JOUR
T1 - f-Divergence Variational Inference
AU - Wan, Neng
AU - Li, Dapeng
AU - Hovakimyan, Naira
N1 - Funding Information:
This work was supported by AFOSR under Grant FA9550-15-1-0518 and NSF NRI under Grant ECCS-1830639. The authors would like to thank the anonymous editors and reviewers for their constructive comments, Dr. Xinyue Chang (Iowa State Univ.), Lei Ding (Univ. of Alberta), Zhaobin Kuang (Stanford), Yang Wang (Univ. of Alabama), and Yanbo Xu (Georgia Tech.) for their helpful suggestions, and Prof. Evangelos A. Theodorou for his heuristic and insightful comments on this paper.
Publisher Copyright:
© 2020 Neural information processing systems foundation. All rights reserved.
PY - 2020
Y1 - 2020
N2 - This paper introduces f-divergence variational inference (f-VI), which generalizes variational inference to all f-divergences. Initiated from minimizing a crafty surrogate f-divergence that shares statistical consistency with the f-divergence, the f-VI framework not only unifies a number of existing VI methods, e.g. Kullback–Leibler VI [1], Rényi's α-VI [2], and χ-VI [3], but also offers a standardized toolkit for VI subject to arbitrary divergences from the f-divergence family. A general f-variational bound is derived and provides a sandwich estimate of the marginal likelihood (or evidence). The development of f-VI unfolds with a stochastic optimization scheme that utilizes the reparameterization trick, importance weighting, and Monte Carlo approximation; a mean-field approximation scheme that generalizes the well-known coordinate ascent variational inference (CAVI) is also proposed for f-VI. Empirical examples, including variational autoencoders and Bayesian neural networks, are provided to demonstrate the effectiveness and wide applicability of f-VI.
AB - This paper introduces f-divergence variational inference (f-VI), which generalizes variational inference to all f-divergences. Initiated from minimizing a crafty surrogate f-divergence that shares statistical consistency with the f-divergence, the f-VI framework not only unifies a number of existing VI methods, e.g. Kullback–Leibler VI [1], Rényi's α-VI [2], and χ-VI [3], but also offers a standardized toolkit for VI subject to arbitrary divergences from the f-divergence family. A general f-variational bound is derived and provides a sandwich estimate of the marginal likelihood (or evidence). The development of f-VI unfolds with a stochastic optimization scheme that utilizes the reparameterization trick, importance weighting, and Monte Carlo approximation; a mean-field approximation scheme that generalizes the well-known coordinate ascent variational inference (CAVI) is also proposed for f-VI. Empirical examples, including variational autoencoders and Bayesian neural networks, are provided to demonstrate the effectiveness and wide applicability of f-VI.
UR - http://www.scopus.com/inward/record.url?scp=85102664011&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85102664011&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85102664011
SN - 1049-5258
VL - 2020-December
JO - Advances in Neural Information Processing Systems
JF - Advances in Neural Information Processing Systems
T2 - 34th Conference on Neural Information Processing Systems, NeurIPS 2020
Y2 - 6 December 2020 through 12 December 2020
ER -