TY - GEN
T1 - On the Importance of Gradient Norm in PAC-Bayesian Bounds
AU - Gat, Itai
AU - Adi, Yossi
AU - Schwing, Alexander
AU - Hazan, Tamir
N1 - Acknowledgements: This work is supported in part by NSF under Grant #2008387 and BSF under Grant #2019783.
PY - 2022
Y1 - 2022
AB - Generalization bounds that assess the difference between the true risk and the empirical risk have been studied extensively. However, to obtain such bounds, current techniques rely on strict assumptions such as a uniformly bounded or a Lipschitz loss function. To avoid these assumptions, in this paper we follow an alternative approach: we relax the uniform boundedness assumption by using on-average bounded loss and on-average bounded gradient-norm assumptions. Following this relaxation, we propose a new generalization bound that exploits the contractivity of the log-Sobolev inequalities. These inequalities introduce an additional loss-gradient norm term into the generalization bound, which intuitively serves as a surrogate for model complexity. We apply the proposed bound to Bayesian deep nets and empirically analyze the effect of this new loss-gradient norm term on different neural architectures.
UR - http://www.scopus.com/inward/record.url?scp=85163152656&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85163152656&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85163152656
T3 - Advances in Neural Information Processing Systems
BT - Advances in Neural Information Processing Systems 35 - 36th Conference on Neural Information Processing Systems, NeurIPS 2022
A2 - Koyejo, S.
A2 - Mohamed, S.
A2 - Agarwal, A.
A2 - Belgrave, D.
A2 - Cho, K.
A2 - Oh, A.
PB - Neural Information Processing Systems Foundation
T2 - 36th Conference on Neural Information Processing Systems, NeurIPS 2022
Y2 - 28 November 2022 through 9 December 2022
ER -