TY - JOUR
T1 - Pushing convertible constraints in frequent itemset mining
AU - Pei, Jian
AU - Han, Jiawei
AU - Lakshmanan, Laks V.S.
N1 - Funding Information:
∗The work was supported in part by grants from the Natural Sciences and Engineering Research Council of Canada, and the Networks of Centres of Excellence of Canada (NCE/IRIS-3). A preliminary version of the paper has been published by the authors as Pei et al., “Mining Frequent Itemsets with Convertible Constraints,” in the Proceedings of 2001 IEEE International Conference on Data Engineering (ICDE’01), Heidelberg, Germany, April 2001, pp. 433–332.
PY - 2004/5
Y1 - 2004/5
N2 - Recent work has highlighted the importance of the constraint-based mining paradigm in the context of frequent itemsets, associations, correlations, sequential patterns, and many other interesting patterns in large databases. Constraint pushing techniques have been developed for mining frequent patterns and associations with antimonotonic, monotonic, and succinct constraints. In this paper, we study constraints which cannot be handled with existing theory and techniques in frequent pattern mining. For example, avg(S)θv, median(S)θv, sum(S)θv (S can contain items of arbitrary values, θ ε {>, <, ≤, ≥} and v is a real number.) are customarily regarded as "tough" constraints in that they cannot be pushed inside an algorithm such as Apriori. We develop a notion of convertible constraints and systematically analyze, classify, and characterize this class. We also develop techniques which enable them to be readily pushed deep inside the recently developed FP-growth algorithm for frequent itemset mining. Results from our detailed experiments show the effectiveness of the techniques developed.
AB - Recent work has highlighted the importance of the constraint-based mining paradigm in the context of frequent itemsets, associations, correlations, sequential patterns, and many other interesting patterns in large databases. Constraint pushing techniques have been developed for mining frequent patterns and associations with antimonotonic, monotonic, and succinct constraints. In this paper, we study constraints which cannot be handled with existing theory and techniques in frequent pattern mining. For example, avg(S)θv, median(S)θv, sum(S)θv (S can contain items of arbitrary values, θ ε {>, <, ≤, ≥} and v is a real number.) are customarily regarded as "tough" constraints in that they cannot be pushed inside an algorithm such as Apriori. We develop a notion of convertible constraints and systematically analyze, classify, and characterize this class. We also develop techniques which enable them to be readily pushed deep inside the recently developed FP-growth algorithm for frequent itemset mining. Results from our detailed experiments show the effectiveness of the techniques developed.
KW - Algorithm
KW - Constraint
KW - Convertible constraint
KW - Frequent itemset mining
KW - Pruning
UR - http://www.scopus.com/inward/record.url?scp=3543095068&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=3543095068&partnerID=8YFLogxK
U2 - 10.1023/B:DAMI.0000023674.74932.4c
DO - 10.1023/B:DAMI.0000023674.74932.4c
M3 - Article
AN - SCOPUS:3543095068
SN - 1384-5810
VL - 8
SP - 227
EP - 252
JO - Data Mining and Knowledge Discovery
JF - Data Mining and Knowledge Discovery
IS - 3
ER -