Fairness-aware Multi-view Clustering

Lecheng Zheng, Yada Zhu, Jingrui He

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In the era of big data, we often face the challenges of data heterogeneity and a lack of label information simultaneously. In the financial domain (e.g., fraud detection), the heterogeneous data may include not only numerical data (e.g., total debt and yearly income) but also text and images (e.g., financial statements and invoice images). At the same time, the label information (e.g., fraudulent transactions) may be missing for building predictive models. To address these challenges, many state-of-the-art multi-view clustering methods have been proposed and have achieved outstanding performance. However, these methods typically do not take fairness into consideration and are likely to generate biased results using sensitive information such as race and gender. Therefore, in this paper, we propose a fairness-aware multi-view clustering method named Fair-MVC. It incorporates a group fairness constraint into the soft membership assignment for each cluster to ensure that the fraction of different groups in each cluster is approximately identical to that in the entire data set. Meanwhile, we adopt ideas from both contrastive learning and non-contrastive learning and propose novel regularizers to handle heterogeneous data in complex scenarios with missing data or noisy features. Experimental results on real-world data sets demonstrate the effectiveness and efficiency of the proposed framework. We also derive insights regarding the relative performance of the proposed regularizers in various scenarios.
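
The group fairness notion described in the abstract (each cluster's group fractions should roughly match those of the whole data set) can be illustrated with a minimal sketch. This is not the Fair-MVC objective or the authors' code; the function name `group_balance_gap`, its arguments, and the choice of a maximum absolute gap as the summary statistic are all assumptions made for illustration.

```python
import numpy as np


def group_balance_gap(soft_assign, groups):
    """Largest absolute gap between per-cluster and global group fractions.

    Hypothetical helper for illustration only (not from the paper).
    soft_assign: (n, k) soft cluster memberships, each row summing to 1.
    groups:      (n,) integer sensitive-group labels in 0..g-1.
    Assumes every cluster receives nonzero total membership mass.
    """
    soft_assign = np.asarray(soft_assign, dtype=float)
    groups = np.asarray(groups)
    n_groups = int(groups.max()) + 1

    onehot = np.eye(n_groups)[groups]        # (n, g) group indicators
    global_frac = onehot.mean(axis=0)        # (g,) fractions in the full data set

    mass = soft_assign.T @ onehot            # (k, g) soft mass per cluster and group
    cluster_frac = mass / mass.sum(axis=1, keepdims=True)

    # 0.0 means every cluster mirrors the global group fractions exactly.
    return float(np.abs(cluster_frac - global_frac).max())
```

A gap near zero indicates that the soft assignment is balanced in the sense the abstract describes; a penalty of this kind could in principle be added to a clustering loss, though how Fair-MVC actually enforces the constraint is not stated in this record.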

Original language: English (US)
Title of host publication: 2023 SIAM International Conference on Data Mining, SDM 2023
Publisher: Society for Industrial and Applied Mathematics Publications
Pages: 856-864
Number of pages: 9
ISBN (Electronic): 9781611977653
State: Published - 2023
Event: 2023 SIAM International Conference on Data Mining, SDM 2023 - Minneapolis, United States
Duration: Apr 27 2023 → Apr 29 2023

Publication series

Name: 2023 SIAM International Conference on Data Mining, SDM 2023

Conference

Conference: 2023 SIAM International Conference on Data Mining, SDM 2023
Country/Territory: United States
City: Minneapolis
Period: 4/27/23 → 4/29/23

Keywords

  • Clustering
  • Contrastive Learning
  • Multi-view Learning

ASJC Scopus subject areas

  • Education
  • Information Systems
