TY - JOUR
T1 - Differentially Private Post-Processing for Fair Regression
AU - Xian, Ruicheng
AU - Li, Qiaobo
AU - Kamath, Gautam
AU - Zhao, Han
N1 - GK is supported by a Canada CIFAR AI Grant, an NSERC Discovery Grant, and an unrestricted gift from Google. HZ is partially supported by a research grant from the Amazon-Illinois Center on AI for Interactive Conversational Experiences (AICE) and a Google Research Scholar Award.
PY - 2024
Y1 - 2024
N2 - This paper describes a differentially private post-processing algorithm for learning fair regressors satisfying statistical parity, addressing privacy concerns of machine learning models trained on sensitive data, as well as fairness concerns of their potential to propagate historical biases. Our algorithm can be applied to post-process any given regressor to improve fairness by remapping its outputs. It consists of three steps: first, the output distributions are estimated privately via histogram density estimation and the Laplace mechanism, then their Wasserstein barycenter is computed, and the optimal transports to the barycenter are used for post-processing to satisfy fairness. We analyze the sample complexity of our algorithm and provide a fairness guarantee, revealing a tradeoff between the statistical bias and variance induced by the choice of the number of bins in the histogram, in which using fewer bins always favors fairness at the expense of error.
UR - http://www.scopus.com/inward/record.url?scp=85203819559&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85203819559&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85203819559
SN - 2640-3498
VL - 235
SP - 54212
EP - 54235
JO - Proceedings of Machine Learning Research
JF - Proceedings of Machine Learning Research
T2 - 41st International Conference on Machine Learning, ICML 2024
Y2 - 21 July 2024 through 27 July 2024
ER -