A Theoretical Analysis on Independence-driven Importance Weighting for Covariate-shift Generalization

Renzhe Xu, Xingxuan Zhang, Zheyan Shen, Tong Zhang, Peng Cui

Research output: Contribution to journal, Conference article, peer-reviewed

Abstract

Covariate-shift generalization, a typical case of out-of-distribution (OOD) generalization, requires good performance on an unknown test distribution that differs from the accessible training distribution in the form of covariate shift. Recently, independence-driven importance weighting algorithms in the stable learning literature have shown empirical effectiveness in dealing with covariate-shift generalization on several learning models, including regression algorithms and deep neural networks, but theoretical analyses of them are missing. In this paper, we theoretically prove the effectiveness of such algorithms by explaining them as feature selection processes. We first specify a set of variables, named the minimal stable variable set, which is the minimal and optimal set of variables for dealing with covariate-shift generalization under common loss functions, such as the mean squared loss and binary cross-entropy loss. We then prove that, under ideal conditions, independence-driven importance weighting algorithms can identify the variables in this set. An analysis of asymptotic properties is also provided. These theories are further validated in several synthetic experiments. The source code is available at https://github.com/windxrz/independence-driven-IW.
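The feature-selection reading of importance weighting can be illustrated with a small population-level sketch. It is not the authors' implementation (that is in the repository linked above); it is a toy example under stated assumptions. The binary variables `s1`, `s2`, `v`, the probabilities 0.9 and 0.2, the outcome function, and the helper names `marginal` and `population_wls` are all hypothetical. The weights are computed from the known joint distribution as the product of marginals divided by the joint, which makes the covariates exactly independent after reweighting. The algorithms analyzed in the paper instead have to learn such weights from data.

```python
import itertools
import numpy as np

# Hypothetical toy population over binary covariates (s1, s2, v).
# Nothing here is taken from the paper; it only illustrates the idea.
cells = np.array(list(itertools.product([0, 1], repeat=3)), dtype=float)
s1, s2, v = cells[:, 0], cells[:, 1], cells[:, 2]

# Training distribution: s1 and s2 are independent fair coins, while v is
# strongly associated with the interaction s1 * s2.
p_v1 = np.where(s1 * s2 == 1, 0.9, 0.2)
p_train = 0.25 * np.where(v == 1, p_v1, 1.0 - p_v1)

# The conditional mean of the outcome depends only on (s1, s2), so v carries
# no information about Y once s1 and s2 are known. A linear model is
# misspecified because of the interaction term.
y_mean = s1 + s2 + 2.0 * s1 * s2


def marginal(column, mass):
    """Marginal probability of each cell's value in one binary column."""
    p_one = mass[column == 1].sum()
    return np.where(column == 1, p_one, 1.0 - p_one)


# Independence-inducing weights: product of marginals over the joint.
weights = (
    marginal(s1, p_train) * marginal(s2, p_train) * marginal(v, p_train)
) / p_train


def population_wls(mass):
    """Population least squares of Y on (1, s1, s2, v) under a given mass."""
    X = np.column_stack([np.ones(len(cells)), s1, s2, v])
    gram = X.T @ (mass[:, None] * X)
    moment = X.T @ (mass * y_mean)
    return np.linalg.solve(gram, moment)


beta_plain = population_wls(p_train)               # unweighted fit
beta_weighted = population_wls(p_train * weights)  # reweighted fit

print("coefficient on v, unweighted:", beta_plain[3])
print("coefficient on v, reweighted:", beta_weighted[3])
```

In this toy setting the unweighted linear fit assigns a nonzero coefficient to `v`, because `v` acts as a proxy for the omitted interaction. After reweighting, `v` is independent of `(s1, s2)`, and its coefficient vanishes up to floating-point error. This is the sense in which reweighted least squares behaves like feature selection.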

Original language: English (US)
Pages (from-to): 24803-24829
Number of pages: 27
Journal: Proceedings of Machine Learning Research
Volume: 162
State: Published - 2022
Externally published: Yes
Event: 39th International Conference on Machine Learning, ICML 2022 - Baltimore, United States
Duration: Jul 17 2022 to Jul 23 2022

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Statistics and Probability
