Detecting recurring and novel classes in concept-drifting data streams

Mohammad M. Masud, Tahseen M. Al-Khateeb, Latifur Khan, Charu Aggarwal, Jing Gao, Jiawei Han, Bhavani Thuraisingham

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Concept-evolution, which occurs when a new class emerges in the stream, is one of the major challenges in data stream classification. This problem remains unaddressed by most state-of-the-art techniques. A recurring class is a special case of concept-evolution: it arises when a class appears in the stream, disappears for a long time, and then reappears. Existing data stream classification techniques that address the concept-evolution problem wrongly detect recurring classes as novel classes. This creates two main problems. First, considerable resources are wasted in detecting a recurring class as a novel class, because novel class detection is much more computationally and memory intensive than simply recognizing an existing class. Second, when a novel class is identified, human experts are involved in collecting and labeling the instances of that class for future modeling. If a recurring class is reported as a novel class, human effort is wasted in determining whether it is really novel. In this paper, we address the recurring class issue and propose a more realistic novel class detection technique, which remembers a class and identifies it as "not novel" when it reappears after a long disappearance. Our approach has shown a significant reduction in classification error over state-of-the-art stream classification techniques on several benchmark data streams.

Original language: English (US)
Title of host publication: Proceedings - 11th IEEE International Conference on Data Mining, ICDM 2011
Pages: 1176-1181
Number of pages: 6
State: Published - Dec 1 2011
Event: 11th IEEE International Conference on Data Mining, ICDM 2011 - Vancouver, BC, Canada
Duration: Dec 11 2011 to Dec 14 2011

Publication series

Name: Proceedings - IEEE International Conference on Data Mining, ICDM
ISSN (Print): 1550-4786

Other

Other: 11th IEEE International Conference on Data Mining, ICDM 2011
Country: Canada
City: Vancouver, BC
Period: 12/11/11 to 12/14/11

Keywords

  • Novel class
  • Recurring class
  • Stream classification

ASJC Scopus subject areas

  • Engineering(all)
