Rare category detection on time-evolving graphs

Dawei Zhou, Kangyang Wang, Nan Cao, Jingrui He

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Rare category detection(RCD) is an important topicin data mining, focusing on identifying the initial examples fromrare classes in imbalanced data sets. This problem becomes more challenging when the data is presented as time-evolving graphs, as used in synthetic ID detection and insider threat detection. Most existing techniques for RCD are designed for static data sets, thus not suitable for time-evolving RCD applications. To address this challenge, in this paper, we first proposetwo incremental RCD algorithms, SIRD and BIRD. They arebuilt upon existing density-based techniques for RCD, andincrementally update the detection models, which provide 'timeflexible' RCD. Furthermore, based on BIRD, we propose amodified version named BIRD-LI to deal with the cases wherethe exact priors of the minority classes are not available. Wealso identify a critical task in RCD named query distribution. Itaims to allocate the limited budget into multiple time steps, suchthat the initial examples from the rare classes are detected asearly as possible with the minimum labeling cost. The proposedincremental RCD algorithms and various query distributionstrategies are evaluated empirically on both synthetic and real data.

Original languageEnglish (US)
Title of host publicationProceedings - 15th IEEE International Conference on Data Mining, ICDM 2015
EditorsCharu Aggarwal, Zhi-Hua Zhou, Alexander Tuzhilin, Hui Xiong, Xindong Wu
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages1135-1140
Number of pages6
ISBN (Electronic)9781467395038
DOIs
StatePublished - Jan 5 2016
Externally publishedYes
Event15th IEEE International Conference on Data Mining, ICDM 2015 - Atlantic City, United States
Duration: Nov 14 2015Nov 17 2015

Publication series

NameProceedings - IEEE International Conference on Data Mining, ICDM
Volume2016-January
ISSN (Print)1550-4786

Other

Other15th IEEE International Conference on Data Mining, ICDM 2015
Country/TerritoryUnited States
CityAtlantic City
Period11/14/1511/17/15

Keywords

  • Incremental Learning
  • Rare Category Detection
  • Time-evolving Graph Mining

ASJC Scopus subject areas

  • General Engineering

Fingerprint

Dive into the research topics of 'Rare category detection on time-evolving graphs'. Together they form a unique fingerprint.

Cite this