Unsupervised Node Clustering via Contrastive Hard Sampling

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper introduces a fine-grained contrastive learning scheme for unsupervised node clustering. Previous clustering methods focus only on a small set of features (class-dependent features) that exhibit explicit clustering characteristics, ignoring the rest of the feature space (class-invariant features). This paper exploits class-invariant features via graph contrastive learning to discover additional high-quality features for unsupervised clustering. We formulate a novel node-level, fine-grained augmentation framework for self-supervised learning that iteratively identifies competitive contrastive samples from the whole feature space, in the form of positive and negative examples of node relations. While positive examples of node relations are usually expressed as edges under graph homophily, negative examples are implicit, having no direct edge. We show, however, that simply sampling nodes beyond the local neighborhood yields less competitive negative pairs, which are less effective for contrastive learning. Inspired by counterfactual augmentation, we instead sample competitive negative node relations by creating virtual nodes that inherit class-invariant features (in a self-supervised fashion) while altering class-dependent features, creating contrasting pairs that lie closer to the boundary and offer better contrast. Consequently, our experiments demonstrate significant improvements in unsupervised node clustering tasks over six baselines on six real-world social network datasets.
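The abstract does not give the exact construction, so the following is only a rough sketch of the general idea of a "virtual node" hard negative, not the authors' implementation. It assumes that a boolean mask marking the class-dependent feature dimensions is already available (the paper identifies these in a self-supervised way; here the mask is simply hard-coded), and it uses a plain InfoNCE-style loss as a stand-in for whatever objective the paper uses. The names `make_virtual_negative`, `info_nce`, and `dependent_mask` are hypothetical.

```python
import numpy as np

def make_virtual_negative(x_anchor, x_donor, dependent_mask):
    """Build a virtual node: keep the anchor's class-invariant dimensions,
    overwrite only the (assumed) class-dependent ones with the donor's values."""
    virtual = x_anchor.copy()
    virtual[dependent_mask] = x_donor[dependent_mask]
    return virtual

def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def info_nce(anchor, positive, negatives, temperature=0.5):
    """InfoNCE-style loss for one anchor, one positive, and a list of negatives."""
    pos = np.exp(cosine(anchor, positive) / temperature)
    neg = sum(np.exp(cosine(anchor, n) / temperature) for n in negatives)
    return float(-np.log(pos / (pos + neg)))

# Toy features: the first two dimensions are *assumed* class-dependent.
x_anchor = np.array([1.0, 0.0, 0.5, 0.5])
x_neighbor = np.array([0.9, 0.1, 0.4, 0.6])   # linked node, used as the positive
x_far = np.array([0.0, 1.0, -0.5, -0.5])      # node sampled outside the neighborhood
dependent_mask = np.array([True, True, False, False])

virtual = make_virtual_negative(x_anchor, x_far, dependent_mask)

easy_loss = info_nce(x_anchor, x_neighbor, [x_far])    # distant node as the negative
hard_loss = info_nce(x_anchor, x_neighbor, [virtual])  # virtual node as the negative
print(easy_loss, hard_loss)
```

In this toy setting the virtual node shares the anchor's class-invariant dimensions, so it is more similar to the anchor than the distant node is and contributes a larger loss, which is the sense in which such a negative is "more competitive".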

Original language: English (US)
Title of host publication: Database Systems for Advanced Applications - 29th International Conference, DASFAA 2024, Proceedings
Editors: Makoto Onizuka, Jae-Gil Lee, Yongxin Tong, Chuan Xiao, Yoshiharu Ishikawa, Kejing Lu, Sihem Amer-Yahia, H.V. Jagadish
Publisher: Springer
Pages: 285-300
Number of pages: 16
ISBN (Print): 9789819755714
State: Published - 2024
Event: 29th International Conference on Database Systems for Advanced Applications, DASFAA 2024 - Gifu, Japan
Duration: Jul 2 2024 → Jul 5 2024

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14855 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 29th International Conference on Database Systems for Advanced Applications, DASFAA 2024
Country/Territory: Japan
City: Gifu
Period: 7/2/24 → 7/5/24

Keywords

  • Clustering
  • Contrastive learning

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
