TY - JOUR
T1 - α-variational inference with statistical guarantees
AU - Yang, Yun
AU - Pati, Debdeep
AU - Bhattacharya, Anirban
N1 - Funding Information:
Acknowledgments. The first author was supported by NSF Grant DMS-1810831. The second author was supported by NSF Grant DMS-1613156 and NSF CAREER Grant DMS-1653404. The third author was supported by NSF Grant DMS-1613156.
Publisher Copyright:
© Institute of Mathematical Statistics, 2020
PY - 2020
Y1 - 2020
N2 - We provide statistical guarantees for a family of variational approximations to Bayesian posterior distributions, called α-VB, which has close connections with variational approximations of tempered posteriors in the literature. The standard variational approximation is a special case of α-VB with α = 1. When α ∈ (0, 1], a novel class of variational inequalities is developed for linking the Bayes risk under the variational approximation to the objective function in the variational optimization problem, implying that maximizing the evidence lower bound in variational inference has the effect of minimizing the Bayes risk within the variational density family. Operating in a frequentist setup, the variational inequalities imply that point estimates constructed from the α-VB procedure converge at an optimal rate to the true parameter in a wide range of problems. We illustrate our general theory with a number of examples, including the mean-field variational approximation to low- and high-dimensional Bayesian linear regression with spike and slab priors, Gaussian mixture models and latent Dirichlet allocation.
KW - Bayes risk
KW - Evidence lower bound
KW - Latent variable models
KW - Rényi divergence
KW - Variational inference
UR - http://www.scopus.com/inward/record.url?scp=85091170954&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85091170954&partnerID=8YFLogxK
U2 - 10.1214/19-AOS1827
DO - 10.1214/19-AOS1827
M3 - Article
AN - SCOPUS:85091170954
SN - 0090-5364
VL - 48
SP - 886
EP - 905
JO - Annals of Statistics
JF - Annals of Statistics
IS - 2
ER -