TY - JOUR
T1 - A Theoretical Analysis on Independence-driven Importance Weighting for Covariate-shift Generalization
AU - Xu, Renzhe
AU - Zhang, Xingxuan
AU - Shen, Zheyan
AU - Zhang, Tong
AU - Cui, Peng
N1 - This work was supported in part by the National Key R&D Program of China (No. 2018AAA0102004), the National Natural Science Foundation of China (No. 62141607, U1936219), and the Beijing Academy of Artificial Intelligence (BAAI).
PY - 2022
Y1 - 2022
N2 - Covariate-shift generalization, a typical case of out-of-distribution (OOD) generalization, requires good performance on an unknown test distribution that differs from the accessible training distribution in the form of covariate shift. Recently, independence-driven importance weighting algorithms from the stable learning literature have shown empirical effectiveness in dealing with covariate-shift generalization across several learning models, including regression algorithms and deep neural networks, but their theoretical analysis has been missing. In this paper, we theoretically prove the effectiveness of such algorithms by interpreting them as feature selection processes. We first specify a set of variables, named the minimal stable variable set, which is the minimal and optimal set of variables for dealing with covariate-shift generalization under common loss functions, such as the mean squared loss and the binary cross-entropy loss. We then prove that, under ideal conditions, independence-driven importance weighting algorithms identify the variables in this set. An analysis of asymptotic properties is also provided. These theoretical results are further validated in several synthetic experiments. The source code is available at https://github.com/windxrz/independence-driven-IW.
UR - http://www.scopus.com/inward/record.url?scp=85144289202&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85144289202&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85144289202
SN - 2640-3498
VL - 162
SP - 24803
EP - 24829
JO - Proceedings of Machine Learning Research
JF - Proceedings of Machine Learning Research
T2 - 39th International Conference on Machine Learning, ICML 2022
Y2 - 17 July 2022 through 23 July 2022
ER -