Discriminative frequent pattern analysis for effective classification

Hong Cheng, Xifeng Yan, Jiawei Han, Chih Wei Hsu

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

The application of frequent patterns in classification appeared in sporadic studies and achieved initial success in the classification of relational data, text documents and graphs. In this paper, we conduct a systematic exploration of frequent pattern-based classification, and provide solid reasons supporting this methodology. It was well known that feature combinations (patterns) could capture more underlying semantics than single features. However, inclusion of infrequent patterns may not significantly improve the accuracy due to their limited predictive power. By building a connection between pattern frequency and discriminative measures such as information gain and Fisher score, we develop a strategy to set minimum support in frequent pattern mining for generating useful patterns. Based on this strategy, coupled with a proposed feature selection algorithm, discriminative frequent patterns can be generated for building high quality classifiers. We demonstrate that the frequent pattern-based classification framework can achieve good scalability and high accuracy in classifying large datasets. Empirical studies indicate that significant improvement in classification accuracy is achieved (up to 12% in UCI datasets) using the so-selected discriminative frequent patterns.
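The link between pattern frequency and information gain mentioned in the abstract can be illustrated with a small sketch. It is not taken from the paper. It assumes a two-class problem and a binary pattern feature, and the function names (`info_gain`, `max_info_gain`, `min_support_for_gain`) are hypothetical. The idea is that, at a fixed support, the information gain is largest when the pattern is as class-pure as the class prior allows, so a very low support caps the achievable gain, and a gain threshold can be turned into a minimum support.

```python
import math


def entropy(p):
    """Binary entropy in bits; defined as 0 when p is 0 or 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def info_gain(support, prior, q):
    """Information gain of a binary pattern feature for a two-class label.

    support: fraction of records containing the pattern, P(x=1)
    prior:   fraction of positive records, P(c=1)
    q:       fraction of positives among records with the pattern, P(c=1 | x=1)
    """
    if support <= 0.0 or support >= 1.0:
        return 0.0  # a pattern present everywhere or nowhere carries no information
    # P(c=1 | x=0) follows from the law of total probability.
    r = (prior - support * q) / (1 - support)
    if r < -1e-9 or r > 1 + 1e-9:
        raise ValueError("support, prior and q are mutually inconsistent")
    r = min(1.0, max(0.0, r))  # absorb floating-point rounding
    return entropy(prior) - support * entropy(q) - (1 - support) * entropy(r)


def max_info_gain(support, prior):
    """Best-case gain at a given support (assumed sketch, not the paper's formula).

    The gain is convex in q, so its maximum lies at an end of the feasible q range.
    """
    q_lo = max(0.0, (prior - (1 - support)) / support)
    q_hi = min(1.0, prior / support)
    return max(info_gain(support, prior, q_lo), info_gain(support, prior, q_hi))


def min_support_for_gain(ig_threshold, prior, step=0.001):
    """Smallest support on a grid whose best-case gain reaches ig_threshold."""
    for k in range(1, round(1 / step)):
        theta = k * step
        if max_info_gain(theta, prior) >= ig_threshold:
            return theta
    return None  # no support on the grid can reach the threshold


if __name__ == "__main__":
    for theta in (0.01, 0.05, 0.1, 0.5):
        print(theta, round(max_info_gain(theta, 0.5), 4))
    print("min support for gain >= 0.1:", min_support_for_gain(0.1, 0.5))
```

Under these assumptions, patterns whose support falls below the value returned by `min_support_for_gain` cannot reach the chosen gain threshold, which is one way to read the abstract's remark about the limited predictive power of infrequent patterns.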

Original language: English (US)
Title of host publication: 23rd International Conference on Data Engineering, ICDE 2007
Pages: 716-725
Number of pages: 10
DOIs
State: Published - Sep 24 2007
Event: 23rd International Conference on Data Engineering, ICDE 2007 - Istanbul, Turkey
Duration: Apr 15 2007 to Apr 20 2007

Publication series

Name: Proceedings - International Conference on Data Engineering
ISSN (Print): 1084-4627

Other

Other: 23rd International Conference on Data Engineering, ICDE 2007
Country/Territory: Turkey
City: Istanbul
Period: 4/15/07 to 4/20/07

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Information Systems
