Abstract
Efficiency and scalability are fundamental issues concerning data mining in large databases. Although classification has been studied extensively, few of the known methods take serious consideration of efficient induction in large databases and the analysis of data at multiple abstraction levels. This paper addresses the efficiency and scalability issues by proposing a data classification method which integrates attribute-oriented induction, relevance analysis, and the induction of decision trees. Such an integration leads to efficient, high-quality, multiple-level classification of large amounts of data, the relaxation of the requirement of perfect training sets, and the elegant handling of continuous and noisy data.
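As a rough illustration of how the first two stages named in the abstract could fit together, the sketch below generalizes an attribute through a concept hierarchy (attribute-oriented induction) and then ranks attributes by how well they predict the class (relevance analysis). It is a minimal sketch under stated assumptions, not the paper's implementation: the hierarchy `CITY_TO_REGION`, the toy rows, and all function names are hypothetical, and information gain is used here only as an assumed stand-in relevance measure, since the abstract does not say which measure the authors use.

```python
from collections import Counter
from math import log2

# Hypothetical concept hierarchy: maps low-level city values to a region.
CITY_TO_REGION = {
    "Vancouver": "West", "Victoria": "West",
    "Toronto": "East", "Ottawa": "East",
}

def generalize(rows, attr, hierarchy):
    """Attribute-oriented induction step: lift each value of `attr`
    to its parent concept; values missing from the hierarchy are kept."""
    return [{**r, attr: hierarchy.get(r[attr], r[attr])} for r in rows]

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, attr, target):
    """Reduction in class entropy obtained by partitioning rows on `attr`.
    Assumed relevance measure for this sketch only."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attr], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder

def rank_attributes(rows, attrs, target):
    """Relevance analysis: order attributes from most to least informative."""
    return sorted(attrs, key=lambda a: information_gain(rows, a, target),
                  reverse=True)

# Toy training set (invented for illustration).
rows = [
    {"city": "Vancouver", "gender": "F", "label": "high"},
    {"city": "Victoria",  "gender": "M", "label": "high"},
    {"city": "Toronto",   "gender": "F", "label": "low"},
    {"city": "Ottawa",    "gender": "M", "label": "low"},
]

generalized = generalize(rows, "city", CITY_TO_REGION)
ranking = rank_attributes(generalized, ["gender", "city"], "label")
```

In this toy data the generalized `city` attribute separates the classes completely while `gender` carries no information, so a decision tree induced afterwards would only need the higher-ranked attribute. Working on generalized values also shrinks the number of distinct tuples the tree builder has to scan, which is the efficiency argument the abstract makes.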
Original language | English (US) |
---|---|
Title of host publication | Proceedings of the IEEE International Workshop on Research Issues in Data Engineering |
Editors | P. Scheuermann |
Publisher | IEEE |
Pages | 111-120 |
Number of pages | 10 |
State | Published - 1997 |
Externally published | Yes |
Event | Proceedings of the 1997 7th International Workshop on Research Issues in Data Engineering, RIDE'97 - Birmingham, UK; Duration: Apr 7 1997 → Apr 8 1997 |
Other
Other | Proceedings of the 1997 7th International Workshop on Research Issues in Data Engineering, RIDE'97 |
---|---|
City | Birmingham, UK |
Period | 4/7/97 → 4/8/97 |
ASJC Scopus subject areas
- Hardware and Architecture
- Software
- Engineering (miscellaneous)