Rare category characterization

Jing Rui He, Hanghang Tong, Jaime Carbonell

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Rare categories abound, yet their characterization has heretofore received little attention. Fraudulent banking transactions, network intrusions, and rare diseases are examples of rare classes whose detection and characterization are of high value. However, accurate characterization is challenging because the classes are highly skewed and the rare class is not separable from the majority classes; for example, fraudulent transactions masquerade as legitimate ones. This paper proposes the RACH algorithm, which exploits the compactness property of rare categories. It is based on an optimization framework that encloses the rare examples in a minimum-radius hyperball. The framework is then converted into a convex optimization problem, which is in turn solved effectively in its dual form by the projected subgradient method. RACH can be naturally kernelized. Experimental results validate the effectiveness of RACH.
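
The abstract does not state the RACH objective itself, so the following is only a minimal sketch of the general idea it names: enclosing a set of points in a minimum-radius hyperball by working in the dual and taking projected steps. It assumes the textbook minimum enclosing ball dual (maximize sum_i a_i K_ii - a^T K a over the probability simplex), which is smooth, so the subgradient step reduces to a plain gradient step here. The function names, the step-size schedule, and the toy points are all hypothetical and are not taken from the paper.

```python
import math


def linear_kernel(x, y):
    """Plain dot product; any positive semidefinite kernel could be swapped in."""
    return sum(a * b for a, b in zip(x, y))


def project_to_simplex(v):
    """Euclidean projection of v onto {a : a_i >= 0, sum(a) = 1}."""
    theta, running = 0.0, 0.0
    for k, u in enumerate(sorted(v, reverse=True), start=1):
        running += u
        shift = (running - 1.0) / k
        if u - shift > 0.0:
            theta = shift  # keep the shift from the last index that stays positive
    return [max(x - theta, 0.0) for x in v]


def dual_value(K, alpha):
    """Dual objective; at the optimum it equals the squared radius of the ball."""
    n = len(alpha)
    linear = sum(alpha[i] * K[i][i] for i in range(n))
    quad = sum(alpha[i] * K[i][j] * alpha[j] for i in range(n) for j in range(n))
    return linear - quad


def enclosing_ball_dual(points, kernel=linear_kernel, steps=2000, step0=0.1):
    """Projected gradient ascent on the enclosing-ball dual (illustrative only).

    step0 is an assumed starting step size and would need tuning to the
    scale of the kernel matrix on real data.
    """
    n = len(points)
    K = [[kernel(p, q) for q in points] for p in points]
    alpha = [1.0 / n] * n  # start from uniform weights on the simplex
    for t in range(1, steps + 1):
        # Gradient of the dual objective with respect to alpha (K is symmetric).
        grad = [K[i][i] - 2.0 * sum(K[i][j] * alpha[j] for j in range(n))
                for i in range(n)]
        step = step0 / math.sqrt(t)  # diminishing step size
        alpha = project_to_simplex([a + step * g for a, g in zip(alpha, grad)])
    return alpha, dual_value(K, alpha)


# Toy data: two extreme points and one point strictly inside the ball.
points = [(0.0, 0.0), (2.0, 0.0), (1.0, 0.5)]
alpha, radius_sq = enclosing_ball_dual(points)
print([round(a, 3) for a in alpha], round(radius_sq, 3))
```

On this toy input the weights should concentrate on the two outer points, whose midpoint is the ball's center, while the interior point receives zero weight. Because the points enter only through `kernel`, passing a different kernel function gives a kernelized ball without changing the solver, which mirrors the abstract's remark that the method can be kernelized.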

Original language: English (US)
Title of host publication: Proceedings - 10th IEEE International Conference on Data Mining, ICDM 2010
Pages: 226-235
Number of pages: 10
DOIs: https://doi.org/10.1109/ICDM.2010.154
State: Published - Dec 1 2010
Externally published: Yes
Event: 10th IEEE International Conference on Data Mining, ICDM 2010 - Sydney, NSW, Australia
Duration: Dec 14 2010 to Dec 17 2010

Publication series

Name: Proceedings - IEEE International Conference on Data Mining, ICDM
ISSN (Print): 1550-4786

Other

Other: 10th IEEE International Conference on Data Mining, ICDM 2010
Country: Australia
City: Sydney, NSW
Period: 12/14/10 to 12/17/10

Keywords

  • Characterization
  • Compactness
  • Hyperball
  • Minority class
  • Optimization
  • Rare category
  • Subgradient

ASJC Scopus subject areas

  • Engineering (all)

Cite this

He, J. R., Tong, H., & Carbonell, J. (2010). Rare category characterization. In Proceedings - 10th IEEE International Conference on Data Mining, ICDM 2010 (pp. 226-235). [5693976] (Proceedings - IEEE International Conference on Data Mining, ICDM). https://doi.org/10.1109/ICDM.2010.154