An effective framework for characterizing rare categories

Jingrui He, Hanghang Tong, Jaime Carbonell

Research output: Contribution to journal › Article › peer-review

Abstract

Rare categories are becoming increasingly abundant, yet their characterization has received little attention thus far. Fraudulent banking transactions, network intrusions, and rare diseases are examples of rare classes whose detection and characterization are of high value. However, accurate characterization is challenging due to high skewness and non-separability from the majority classes; e.g., fraudulent transactions masquerade as legitimate ones. This paper proposes the RACH algorithm, which exploits the compactness property of rare categories. The algorithm is semi-supervised in nature, since it uses both labeled and unlabeled data. It is based on an optimization framework that encloses the rare examples in a minimum-radius hyperball. The framework is then converted into a convex optimization problem, which is in turn solved effectively in its dual form by the projected subgradient method. RACH can be naturally kernelized. Experimental results validate the effectiveness of RACH.
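To make the hyperball idea concrete, the sketch below solves a much simpler relative of the problem described in the abstract: the plain minimum enclosing ball of a set of points, in its dual form, by projected gradient ascent over the probability simplex. It is not the RACH formulation, which also uses unlabeled data and is not reproduced here. Because this simplified dual is smooth, the subgradient is just the gradient. All function names, the step-size rule, and the iteration count are assumptions made for illustration. The points enter only through a Gram matrix, which is what makes this kind of formulation kernelizable.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {a : a_i >= 0, sum(a) = 1} (sort-based)."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        t = (cumulative - 1.0) / k
        if u_k - t > 0:
            theta = t  # keep the threshold from the last index that passes
    return [max(x - theta, 0.0) for x in v]


def linear_kernel(x, y):
    return sum(a * b for a, b in zip(x, y))


def min_enclosing_ball_dual(points, kernel=linear_kernel, steps=2000):
    """Hypothetical helper: maximize sum_i a_i K_ii - a^T K a over the simplex.

    Returns (alpha, radius_sq). The ball center is sum_i alpha_i * phi(x_i),
    which is explicit only for the linear kernel.
    """
    n = len(points)
    gram = [[kernel(p, q) for q in points] for p in points]
    alpha = [1.0 / n] * n
    trace = sum(gram[i][i] for i in range(n))
    if trace == 0.0:
        return alpha, 0.0  # all points coincide with the origin
    # Assumed step size: 1 / (2 * trace(K)) is at most 1 / L, where
    # L = 2 * lambda_max(K) is the gradient's Lipschitz constant.
    eta = 1.0 / (2.0 * trace)
    for _ in range(steps):
        grad = [
            gram[i][i] - 2.0 * sum(gram[i][j] * alpha[j] for j in range(n))
            for i in range(n)
        ]
        alpha = project_to_simplex([a + eta * g for a, g in zip(alpha, grad)])
    radius_sq = sum(alpha[i] * gram[i][i] for i in range(n)) - sum(
        alpha[i] * alpha[j] * gram[i][j] for i in range(n) for j in range(n)
    )
    return alpha, radius_sq


if __name__ == "__main__":
    pts = [(0.0, 0.0), (4.0, 0.0), (2.0, 1.0)]
    alpha, radius_sq = min_enclosing_ball_dual(pts)
    # For the linear kernel the center can be recovered explicitly.
    center = [sum(a * p[d] for a, p in zip(alpha, pts)) for d in range(2)]
    print(alpha, center, radius_sq)
```

Points whose dual weight ends up at zero lie strictly inside the ball; only the points on the boundary determine the center and radius. Passing a different `kernel` function would yield a ball in the corresponding feature space, with the center left implicit.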

Original language: English (US)
Pages (from-to): 154-165
Number of pages: 12
Journal: Frontiers of Computer Science in China
Volume: 6
Issue number: 2
State: Published - Apr 2012
Externally published: Yes

Keywords

  • characterization
  • compactness
  • hyperball
  • minority class
  • optimization
  • rare category
  • subgradient

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science