TY - GEN
T1 - Finding frequent items in data streams using hierarchical information
AU - Xiaoyu, Wang
AU - Hongyan, Liu
AU - Jiawei, Han
PY - 2007
Y1 - 2007
AB - Finding frequent items or top-k items in data streams is a basic mining task with a wide range of applications. Many algorithms have been proposed to improve performance on this task, but little effort has been made to exploit the hierarchical information carried by items in a data stream. In this paper, we try to improve the accuracy of finding frequent items by using the hierarchical information in a taxonomy. To do that, we propose a method called Merge. Based on this method, we design and implement an algorithm named FISH_Merge. To evaluate the performance of the algorithm, we propose three new measures for testing and develop a hierarchical stream data generator. After conducting a comprehensive experimental study, we conclude that FISH_Merge is more accurate than algorithms that do not use hierarchical information, given the same amount of memory. In addition, our algorithm can provide some information about higher-level items.
UR - http://www.scopus.com/inward/record.url?scp=40949113226&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=40949113226&partnerID=8YFLogxK
U2 - 10.1109/ICSMC.2007.4413754
DO - 10.1109/ICSMC.2007.4413754
M3 - Conference contribution
AN - SCOPUS:40949113226
SN - 1424409918
SN - 9781424409914
T3 - Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics
SP - 431
EP - 436
BT - 2007 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2007
T2 - 2007 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2007
Y2 - 7 October 2007 through 10 October 2007
ER -