A Framework for Projected Clustering of High Dimensional Data Streams

Charu C. Aggarwal, Philip S. Yu, Jiawei Han, Jianyong Wang

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

This chapter presents a new framework, HPStream, for high-dimensional projected clustering of data streams. It finds projected clusters in particular subsets of the dimensions by maintaining condensed representations of the clusters over time. The algorithm produces higher-quality clusters than full-dimensional data stream clustering algorithms. The chapter evaluates the algorithm on a number of real and synthetic data sets; in each case, HPStream is found to be more effective than the full-dimensional CluStream algorithm. High-dimensional projected clustering of data streams opens a new direction for stream data mining. With this methodology, projected clustering can serve as a preprocessing step that may enable more effective methods for stream classification, similarity, evolution, and outlier analysis.
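
The idea of keeping a condensed representation per cluster and reading a dimension subset off it can be illustrated with a minimal Python sketch. This is not the HPStream algorithm as published: the class `ClusterSummary`, its methods, the `decay` parameter, and the rule of choosing the lowest-variance dimensions are all assumptions introduced here for illustration, since the abstract does not specify these details.

```python
class ClusterSummary:
    """Illustrative condensed summary of one cluster (hypothetical names).

    Keeps only additive per-dimension statistics, so each arriving
    stream point is absorbed in O(d) time without storing the point.
    """

    def __init__(self, dims):
        self.dims = dims
        self.n = 0.0                # (possibly decayed) number of points
        self.s1 = [0.0] * dims      # per-dimension sum of values
        self.s2 = [0.0] * dims      # per-dimension sum of squared values

    def add(self, point, decay=1.0):
        """Down-weight the old statistics by `decay`, then absorb `point`.

        decay=1.0 means no forgetting; a value below 1.0 is an assumed,
        simple way to let older points count for less.
        """
        self.n = self.n * decay + 1.0
        for j, x in enumerate(point):
            self.s1[j] = self.s1[j] * decay + x
            self.s2[j] = self.s2[j] * decay + x * x

    def variance(self, j):
        """Variance of the cluster along dimension j, from the summary."""
        mean = self.s1[j] / self.n
        return max(self.s2[j] / self.n - mean * mean, 0.0)

    def projected_dims(self, k):
        """Return the k dimensions along which the cluster is tightest."""
        order = sorted(range(self.dims), key=self.variance)
        return sorted(order[:k])


# Toy data: tight in dimensions 0 and 2, widely spread in dimension 1.
summary = ClusterSummary(dims=3)
for p in [(1.0, 0.0, 5.0), (1.1, 10.0, 5.1), (0.9, -10.0, 4.9), (1.0, 20.0, 5.0)]:
    summary.add(p)

print(summary.projected_dims(2))
```

In this toy run the cluster is compact only in dimensions 0 and 2, so those are the ones selected; a full-dimensional distance would be dominated by the noisy dimension 1.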

Original language: English (US)
Title of host publication: Proceedings 2004 VLDB Conference
Subtitle of host publication: The 30th International Conference on Very Large Databases (VLDB)
Publisher: Elsevier
Pages: 852-863
Number of pages: 12
ISBN (Electronic): 9780120884698
State: Published - Jan 1 2004
Externally published: Yes

ASJC Scopus subject areas

  • General Computer Science
