TY - GEN
T1 - CMAR
T2 - 1st IEEE International Conference on Data Mining, ICDM'01
AU - Li, Wenmin
AU - Han, Jiawei
AU - Pei, Jian
PY - 2001
Y1 - 2001
AB - Previous studies propose that associative classification has high classification accuracy and strong flexibility in handling unstructured data. However, it still suffers from the huge set of mined rules and from sometimes biased classification or overfitting, since the classification is based on only a single high-confidence rule. In this study, we propose a new associative classification method, CMAR, i.e., Classification based on Multiple Association Rules. The method extends an efficient frequent pattern mining method, FP-growth, constructs a class distribution-associated FP-tree, and mines large databases efficiently. Moreover, it applies a CR-tree structure to store and retrieve mined association rules efficiently, and prunes rules effectively based on confidence, correlation, and database coverage. The classification is performed based on a weighted χ² analysis using multiple strong association rules. Our extensive experiments on 26 databases from the UCI machine learning database repository show that CMAR is consistent, highly effective at classifying various kinds of databases, and has better average classification accuracy than CBA and C4.5. Moreover, our performance study shows that the method is highly efficient and scalable in comparison with other reported associative classification methods.
UR - http://www.scopus.com/inward/record.url?scp=78149313084&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=78149313084&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:78149313084
SN - 0769511198
SN - 9780769511191
T3 - Proceedings - IEEE International Conference on Data Mining, ICDM
SP - 369
EP - 376
BT - Proceedings - 2001 IEEE International Conference on Data Mining, ICDM'01
Y2 - 29 November 2001 through 2 December 2001
ER -