TY - GEN
T1 - Fairgen
T2 - 40th IEEE International Conference on Data Engineering, ICDE 2024
AU - Zheng, Lecheng
AU - Zhou, Dawei
AU - Tong, Hanghang
AU - Xu, Jiejun
AU - Zhu, Yada
AU - He, Jingrui
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - There have been tremendous efforts over the past decades dedicated to the generation of realistic graphs in a variety of domains, ranging from social networks to computer networks, from gene regulatory networks to online transaction networks. Despite the remarkable success, the vast majority of these works are unsupervised in nature and are typically trained to minimize the expected graph reconstruction loss, which would result in the representation disparity issue in the generated graphs, i.e., the protected groups (often minorities) contribute less to the objective and thus suffer from systematically higher errors. In this paper, we aim to tailor graph generation to downstream mining tasks by leveraging label information and user-preferred parity constraints. In particular, we start from the investigation of representation disparity in the context of graph generative models. To mitigate the disparity, we propose a fairness-aware graph generative model named Fairgen. Our model jointly trains a label-informed graph generation module and a fair representation learning module by progressively learning the behaviors of the protected and unprotected groups, from the 'easy' concepts to the 'hard' ones. In addition, we propose a generic context sampling strategy for graph generative models, which is proven to be capable of fairly capturing the contextual information of each group with a high probability. Experimental results on seven real-world data sets demonstrate that Fairgen (1) obtains performance on par with state-of-the-art graph generative models across nine network properties, (2) mitigates the representation disparity issues in the generated graphs, and (3) substantially boosts the model performance by up to 17% in downstream tasks via data augmentation.
AB - There have been tremendous efforts over the past decades dedicated to the generation of realistic graphs in a variety of domains, ranging from social networks to computer networks, from gene regulatory networks to online transaction networks. Despite the remarkable success, the vast majority of these works are unsupervised in nature and are typically trained to minimize the expected graph reconstruction loss, which would result in the representation disparity issue in the generated graphs, i.e., the protected groups (often minorities) contribute less to the objective and thus suffer from systematically higher errors. In this paper, we aim to tailor graph generation to downstream mining tasks by leveraging label information and user-preferred parity constraints. In particular, we start from the investigation of representation disparity in the context of graph generative models. To mitigate the disparity, we propose a fairness-aware graph generative model named Fairgen. Our model jointly trains a label-informed graph generation module and a fair representation learning module by progressively learning the behaviors of the protected and unprotected groups, from the 'easy' concepts to the 'hard' ones. In addition, we propose a generic context sampling strategy for graph generative models, which is proven to be capable of fairly capturing the contextual information of each group with a high probability. Experimental results on seven real-world data sets demonstrate that Fairgen (1) obtains performance on par with state-of-the-art graph generative models across nine network properties, (2) mitigates the representation disparity issues in the generated graphs, and (3) substantially boosts the model performance by up to 17% in downstream tasks via data augmentation.
KW - Deep Learning
KW - Fairness
KW - Graph Generative Model
KW - Self-paced Learning
UR - http://www.scopus.com/inward/record.url?scp=85200472340&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85200472340&partnerID=8YFLogxK
U2 - 10.1109/ICDE60146.2024.00181
DO - 10.1109/ICDE60146.2024.00181
M3 - Conference contribution
AN - SCOPUS:85200472340
T3 - Proceedings - International Conference on Data Engineering
SP - 2285
EP - 2297
BT - Proceedings - 2024 IEEE 40th International Conference on Data Engineering, ICDE 2024
PB - IEEE Computer Society
Y2 - 13 May 2024 through 17 May 2024
ER -