Generalization and decision tree induction: efficient classification in data mining

Micheline Kamber, Lara Winstone, Wan Gong, Shan Cheng, Jiawei Han

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Efficiency and scalability are fundamental issues in data mining over large databases. Although classification has been studied extensively, few known methods seriously address efficient induction in large databases or the analysis of data at multiple abstraction levels. This paper addresses both issues by proposing a data classification method that integrates attribute-oriented induction, relevance analysis, and the induction of decision trees. This integration leads to efficient, high-quality, multiple-level classification of large amounts of data, relaxes the requirement of perfect training sets, and handles continuous and noisy data elegantly.
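The three stages named in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the concept hierarchy, the toy records, the use of information gain as the relevance measure, the `threshold` value, and all function names are assumptions made for illustration only.

```python
from collections import Counter
from math import log2

# Hypothetical concept hierarchy: maps low-level values to higher-level concepts.
HIERARCHY = {
    "city": {"Vancouver": "Canada", "Toronto": "Canada",
             "Seattle": "USA", "Boston": "USA"},
}

def generalize(rows, hierarchy):
    """Attribute-oriented induction step: climb the hierarchy one level."""
    return [
        {attr: hierarchy.get(attr, {}).get(val, val) for attr, val in row.items()}
        for row in rows
    ]

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def info_gain(rows, attr, target):
    """Reduction in class entropy obtained by splitting on attr."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r[target] for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

def relevant_attributes(rows, attrs, target, threshold=0.1):
    """Relevance analysis (assumed measure): keep attributes above a gain threshold."""
    return [a for a in attrs if info_gain(rows, a, target) >= threshold]

def build_tree(rows, attrs, target):
    """ID3-style induction: a leaf is a class label, a node is {attr: {value: subtree}}."""
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1 or not attrs:
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: info_gain(rows, a, target))
    rest = [a for a in attrs if a != best]
    return {best: {v: build_tree([r for r in rows if r[best] == v], rest, target)
                   for v in {r[best] for r in rows}}}

# Toy training data (hypothetical).
rows = [
    {"city": "Vancouver", "id_parity": "even", "buys": "yes"},
    {"city": "Toronto",   "id_parity": "odd",  "buys": "yes"},
    {"city": "Seattle",   "id_parity": "even", "buys": "no"},
    {"city": "Boston",    "id_parity": "odd",  "buys": "no"},
]

generalized = generalize(rows, HIERARCHY)
kept = relevant_attributes(generalized, ["city", "id_parity"], "buys")
tree = build_tree(generalized, kept, "buys")
```

In this toy run, generalization collapses four cities into two countries, relevance analysis drops `id_parity` because it carries no information about the class, and the tree is induced on the smaller, more abstract table. That ordering is the point of the sketch: the tree builder sees fewer distinct values and fewer attributes than the raw data contains.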

Original language: English (US)
Title of host publication: Proceedings of the IEEE International Workshop on Research Issues in Data Engineering
Editors: P. Scheuermann
Publisher: IEEE
Pages: 111-120
Number of pages: 10
State: Published - 1997
Externally published: Yes
Event: Proceedings of the 1997 7th International Workshop on Research Issues in Data Engineering, RIDE'97 - Birmingham, UK
Duration: Apr 7 1997 to Apr 8 1997

Other

Other: Proceedings of the 1997 7th International Workshop on Research Issues in Data Engineering, RIDE'97
City: Birmingham, UK
Period: 4/7/97 to 4/8/97

ASJC Scopus subject areas

  • Hardware and Architecture
  • Software
  • Engineering (miscellaneous)
