Finding frequent items in data streams using hierarchical information

Wang Xiaoyu, Liu Hongyan, Han Jiawei

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Finding frequent items or top-k items in data streams is a basic mining task with a wide range of applications. Many algorithms have been proposed to improve the performance of this task, but little effort has been made to exploit the hierarchical information carried by items in a data stream. In this paper, we try to improve the accuracy of finding frequent items by using the hierarchical information in a taxonomy. To do so, we propose a strategy called Merge. Based on this strategy, we design and implement an algorithm named FISH_Merge. To evaluate the performance of the algorithm, we propose three new measures for testing and develop a hierarchical stream data generator. After a comprehensive experimental study, we conclude that FISH_Merge is more accurate than algorithms that do not use hierarchical information, given the same amount of memory. In addition, our algorithm can provide some information about higher-level items.
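
The abstract does not describe how the Merge strategy works, so the following is only an illustrative sketch of the general idea of using a taxonomy in a memory-bounded counter table. It is not the paper's FISH_Merge. The function name `count_with_rollup`, the `parent` mapping, and the eviction rule (fold the smallest counter into its taxonomy parent when the table is full) are all assumptions made for this example.

```python
def count_with_rollup(stream, parent, capacity):
    """Count stream items using at most `capacity` counters.

    `parent` maps a taxonomy node to its parent; root nodes are absent
    from the mapping. This is a hypothetical scheme for illustration,
    not the algorithm from the paper.
    """
    counts = {}
    for item in stream:
        counts[item] = counts.get(item, 0) + 1
        while len(counts) > capacity:
            # Prefer the smallest counter that still has a parent,
            # so its count can be kept at a coarser level.
            mergeable = [node for node in counts if node in parent]
            if mergeable:
                victim = min(mergeable, key=lambda node: counts[node])
                up = parent[victim]
                counts[up] = counts.get(up, 0) + counts.pop(victim)
            else:
                # Only root nodes remain: drop the smallest one.
                # Its count is lost in this case.
                victim = min(counts, key=lambda node: counts[node])
                del counts[victim]
    return counts


# Hypothetical two-level taxonomy and a short stream.
taxonomy = {"cola": "drinks", "tea": "drinks", "bread": "food", "rice": "food"}
stream = ["cola", "cola", "cola", "tea", "bread", "rice", "cola"]
result = count_with_rollup(stream, taxonomy, capacity=3)
print(result)  # {'cola': 4, 'drinks': 1, 'food': 2}
```

In this sketch the frequent leaf item keeps its own counter, while infrequent items are reported only through their higher-level category. This mirrors, in a simplified way, the abstract's remark that the algorithm can also provide information about higher-level items.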

Original language: English (US)
Title of host publication: 2007 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2007
Pages: 431-436
Number of pages: 6
State: Published - 2007
Event: 2007 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2007 - Montreal, QC, Canada
Duration: Oct 7 2007 to Oct 10 2007

Publication series

Name: Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics
ISSN (Print): 1062-922X

Other

Other: 2007 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2007
Country/Territory: Canada
City: Montreal, QC
Period: 10/7/07 to 10/10/07

ASJC Scopus subject areas

  • General Engineering
