TY - GEN
T1 - Fairness-aware Model-agnostic Positive and Unlabeled Learning
AU - Wu, Ziwei
AU - He, Jingrui
N1 - Publisher Copyright:
© 2022 ACM.
PY - 2022/6/21
Y1 - 2022/6/21
N2 - With the increasing application of machine learning to high-stakes decision-making problems, potential algorithmic bias against people from certain social groups negatively impacts both individuals and society at large. In real-world scenarios, many such problems involve positive and unlabeled data, as in medical diagnosis, criminal risk assessment, and recommender systems. For instance, in medical diagnosis, only diagnosed diseases are recorded (positive), while all others remain unrecorded (unlabeled). Despite the large body of existing work on fairness-aware machine learning in the (semi-)supervised and unsupervised settings, the fairness issue remains largely under-explored in the aforementioned Positive and Unlabeled Learning (PUL) context, where it is usually more severe. In this paper, to alleviate this tension, we propose a fairness-aware PUL method named FairPUL. In particular, for binary classification over individuals from two populations, we adopt as our fairness metric the requirement of similar true positive rates and false positive rates across both populations. Based on our analysis of the optimal fair classifier for PUL, we design a model-agnostic post-processing framework that leverages both positive and unlabeled examples. Our framework is proven to be statistically consistent in terms of both the classification error and the fairness metric. Experiments on synthetic and real-world data sets demonstrate that our framework outperforms state-of-the-art methods in both PUL and fair classification.
AB - With the increasing application of machine learning to high-stakes decision-making problems, potential algorithmic bias against people from certain social groups negatively impacts both individuals and society at large. In real-world scenarios, many such problems involve positive and unlabeled data, as in medical diagnosis, criminal risk assessment, and recommender systems. For instance, in medical diagnosis, only diagnosed diseases are recorded (positive), while all others remain unrecorded (unlabeled). Despite the large body of existing work on fairness-aware machine learning in the (semi-)supervised and unsupervised settings, the fairness issue remains largely under-explored in the aforementioned Positive and Unlabeled Learning (PUL) context, where it is usually more severe. In this paper, to alleviate this tension, we propose a fairness-aware PUL method named FairPUL. In particular, for binary classification over individuals from two populations, we adopt as our fairness metric the requirement of similar true positive rates and false positive rates across both populations. Based on our analysis of the optimal fair classifier for PUL, we design a model-agnostic post-processing framework that leverages both positive and unlabeled examples. Our framework is proven to be statistically consistent in terms of both the classification error and the fairness metric. Experiments on synthetic and real-world data sets demonstrate that our framework outperforms state-of-the-art methods in both PUL and fair classification.
KW - Fairness
KW - Machine Learning
KW - Positive and Unlabeled Learning
UR - http://www.scopus.com/inward/record.url?scp=85132970036&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85132970036&partnerID=8YFLogxK
U2 - 10.1145/3531146.3533225
DO - 10.1145/3531146.3533225
M3 - Conference contribution
AN - SCOPUS:85132970036
T3 - ACM International Conference Proceeding Series
SP - 1698
EP - 1708
BT - Proceedings of 2022 5th ACM Conference on Fairness, Accountability, and Transparency, FAccT 2022
PB - Association for Computing Machinery
T2 - 5th ACM Conference on Fairness, Accountability, and Transparency, FAccT 2022
Y2 - 21 June 2022 through 24 June 2022
ER -