Abstract
We present a decision-tree-based symbolic rule induction system for automatically categorizing text documents. Our rule induction method is a novel combination of (1) a fast decision tree induction algorithm especially suited to text data and (2) a new method for converting a decision tree into a simplified rule set that remains logically equivalent to the original tree. We report experimental results from applying this system to some practical problems.
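
To make the idea of a simplified but logically equivalent rule set concrete, here is a minimal sketch in Python. It is not the authors' algorithm, which the abstract does not describe. The toy tree, the word features (`price`, `stock`), the labels, and every function name are hypothetical. The sketch reads one rule off each root-to-leaf path, merges pairs of same-label rules that differ only in the sign of a single word test, and then checks equivalence with the tree by enumerating every combination of word presence.

```python
from itertools import product

# Hypothetical toy tree: a leaf is a label (str); an internal node is
# (word, subtree_if_word_absent, subtree_if_word_present).
TREE = ("price",
        ("stock", "other", "finance"),
        ("stock", "finance", "finance"))


def tree_to_rules(tree, path=()):
    """Return one (conditions, label) rule per root-to-leaf path."""
    if isinstance(tree, str):
        return [(frozenset(path), tree)]
    word, absent, present = tree
    return (tree_to_rules(absent, path + ((word, False),)) +
            tree_to_rules(present, path + ((word, True),)))


def merge_once(rules):
    """Merge one pair of same-label rules differing only in one test's sign.

    (A and w) or (A and not w) is equivalent to A, so the merge preserves
    the meaning of the rule set.
    """
    for i, (c1, l1) in enumerate(rules):
        for j, (c2, l2) in enumerate(rules):
            if i < j and l1 == l2:
                diff = c1 ^ c2
                if len(diff) == 2 and len({w for w, _ in diff}) == 1:
                    rest = [r for k, r in enumerate(rules) if k not in (i, j)]
                    return rest + [(c1 & c2, l1)], True
    return rules, False


def simplify(rules):
    """Apply merge_once until no further merge is possible."""
    changed = True
    while changed:
        rules, changed = merge_once(rules)
    return rules


def classify_tree(tree, doc):
    """Follow the tree for a document given as a set of words."""
    while not isinstance(tree, str):
        word, absent, present = tree
        tree = present if word in doc else absent
    return tree


def classify_rules(rules, doc):
    """Return the set of labels whose rules fire on the document."""
    return {label for conds, label in rules
            if all((w in doc) == present for w, present in conds)}


def equivalent(tree, rules, vocabulary):
    """Exhaustively compare tree and rules over all word combinations."""
    for bits in product([False, True], repeat=len(vocabulary)):
        doc = {w for w, b in zip(vocabulary, bits) if b}
        if classify_rules(rules, doc) != {classify_tree(tree, doc)}:
            return False
    return True


rules = tree_to_rules(TREE)
simplified = simplify(rules)
```

The exhaustive check in `equivalent` is only feasible for a tiny vocabulary like this one. It serves here purely to illustrate what "logically equivalent" means for a tree and a rule set.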
Original language | English (US) |
---|---|
Pages (from-to) | 428-437 |
Number of pages | 10 |
Journal | IBM Systems Journal |
Volume | 41 |
Issue number | 3 |
State | Published - 2002 |
Externally published | Yes |
ASJC Scopus subject areas
- Software
- Theoretical Computer Science
- General Computer Science
- Information Systems
- Computer Graphics and Computer-Aided Design
- Computational Theory and Mathematics