Mining concept-drifting data streams using ensemble classifiers

Haixun Wang, Wei Fan, Philip S. Yu, Jiawei Han

Research output: Contribution to conference (Paper, peer-review)


Recently, mining data streams with concept drifts for actionable insights has become an important and challenging task for a wide range of applications, including credit card fraud protection, target marketing, and network intrusion detection. Conventional knowledge discovery tools face two challenges: the overwhelming volume of the streaming data and the concept drifts. In this paper, we propose a general framework for mining concept-drifting data streams using weighted ensemble classifiers. We train an ensemble of classification models, such as C4.5, RIPPER, and naive Bayesian, from sequential chunks of the data stream. The classifiers in the ensemble are judiciously weighted based on their expected classification accuracy on the test data under the time-evolving environment. Thus, the ensemble approach improves both the efficiency in learning the model and the accuracy in performing classification. Our empirical study shows that the proposed methods have a substantial advantage over single-classifier approaches in prediction accuracy, and that the ensemble framework is effective for a variety of classification models.
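The chunk-based weighting idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not give the weighting formula, so the sketch assumes that accuracy on the most recent labeled chunk is a reasonable proxy for expected accuracy on upcoming data. The class names `CentroidClassifier` and `WeightedChunkEnsemble`, the `max_members` parameter, and the nearest-centroid base learner (standing in for C4.5, RIPPER, or naive Bayesian) are all hypothetical.

```python
from collections import Counter, defaultdict


class CentroidClassifier:
    """Hypothetical stand-in base learner: predicts the nearest class centroid."""

    def fit(self, X, y):
        counts = Counter(y)
        sums = {}
        for row, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(row))
            for j, value in enumerate(row):
                acc[j] += value
        self.centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}
        return self

    def predict_one(self, row):
        def sq_dist(label):
            return sum((a - b) ** 2 for a, b in zip(row, self.centroids[label]))
        return min(self.centroids, key=sq_dist)


def accuracy(model, X, y):
    """Fraction of rows in (X, y) that the model labels correctly."""
    hits = sum(model.predict_one(row) == target for row, target in zip(X, y))
    return hits / len(y)


class WeightedChunkEnsemble:
    """One model per chunk; weights are re-estimated on the newest chunk."""

    def __init__(self, max_members=3):
        self.max_members = max_members
        self.members = []  # list of (weight, model) pairs

    def update(self, X, y):
        # Train a new member on the newest chunk only.
        new_model = CentroidClassifier().fit(X, y)
        candidates = [model for _, model in self.members] + [new_model]
        # Assumption: accuracy on the newest chunk approximates expected
        # accuracy on upcoming data. Scoring the new model on its own
        # training chunk is optimistic; a held-out split would be fairer.
        scored = [(accuracy(model, X, y), model) for model in candidates]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        # Keep only the highest-weighted members.
        self.members = scored[: self.max_members]

    def predict_one(self, row):
        # Weighted vote: each member adds its weight to the label it predicts.
        votes = defaultdict(float)
        for weight, model in self.members:
            votes[model.predict_one(row)] += weight
        return max(votes, key=votes.get)
```

In this sketch, a member trained before a concept drift receives a low weight once it stops agreeing with the newest chunk, so its vote fades without the model having to be retrained on the full stream.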

Original language: English (US)
Number of pages: 10
State: Published - 2003
Event: 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '03 - Washington, DC, United States
Duration: Aug 24 2003 to Aug 27 2003


Other: 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '03
Country/Territory: United States
City: Washington, DC


  • Classifier
  • Classifier ensemble
  • Concept drift
  • Data streams

ASJC Scopus subject areas

  • Software
  • Information Systems

