TY - GEN
T1 - Optimization of Constrained Frequent Set Queries with 2-variable Constraints
AU - Lakshmanan, Laks V.S.
AU - Ng, Raymond
AU - Han, Jiawei
AU - Pang, Alex
N1 - Publisher Copyright:
© 1999 ACM.
PY - 1999/6/1
Y1 - 1999/6/1
N2 - Currently, there is tremendous interest in providing ad-hoc mining capabilities in database management systems. As a first step towards this goal, in [15] we proposed an architecture for supporting constraint-based, human-centered, exploratory mining of various kinds of rules including associations, introduced the notion of constrained frequent set queries (CFQs), and developed effective pruning optimizations for CFQs with 1-variable (1-var) constraints.While 1-var constraints are useful for constraining the antecedent and consequent separately, many natural examples of CFQs illustrate the need for constraining the antecedent and consequent jointly, for which 2-variable (2-var) constraints are indispensable. Developing pruning optimizations for CFQs with 2-var constraints is the subject of this paper. But this is a difficult problem because: (i) in 2-var constraints, both variables keep changing and, unlike 1-var constraints, there is no fixed target for pruning; (ii) as we show, "conventional"monotonicity-based optimization techniques do not apply effectively to 2-var constraints.The contributions are as follows. (1) We introduce a notion of quasi-succinctness, which allows a quasi-succinct 2-var constraint to be reduced to two succinct 1-var constraints for pruning. (2) We characterize the class of 2-var constraints that are quasi-succinct. (3) We develop heuristic techniques for non-quasi-succinct constraints. Experimental results show the effectiveness of all our techniques. (4) We propose a query optimizer for CFQs and show that for a large class of constraints, the computation strategy generated by the optimizer is ccc-optimal, i.e., minimizing the effort incurred w.r.t. constraint checking and support counting.
AB - Currently, there is tremendous interest in providing ad-hoc mining capabilities in database management systems. As a first step towards this goal, in [15] we proposed an architecture for supporting constraint-based, human-centered, exploratory mining of various kinds of rules including associations, introduced the notion of constrained frequent set queries (CFQs), and developed effective pruning optimizations for CFQs with 1-variable (1-var) constraints.While 1-var constraints are useful for constraining the antecedent and consequent separately, many natural examples of CFQs illustrate the need for constraining the antecedent and consequent jointly, for which 2-variable (2-var) constraints are indispensable. Developing pruning optimizations for CFQs with 2-var constraints is the subject of this paper. But this is a difficult problem because: (i) in 2-var constraints, both variables keep changing and, unlike 1-var constraints, there is no fixed target for pruning; (ii) as we show, "conventional"monotonicity-based optimization techniques do not apply effectively to 2-var constraints.The contributions are as follows. (1) We introduce a notion of quasi-succinctness, which allows a quasi-succinct 2-var constraint to be reduced to two succinct 1-var constraints for pruning. (2) We characterize the class of 2-var constraints that are quasi-succinct. (3) We develop heuristic techniques for non-quasi-succinct constraints. Experimental results show the effectiveness of all our techniques. (4) We propose a query optimizer for CFQs and show that for a large class of constraints, the computation strategy generated by the optimizer is ccc-optimal, i.e., minimizing the effort incurred w.r.t. constraint checking and support counting.
UR - http://www.scopus.com/inward/record.url?scp=85134325545&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85134325545&partnerID=8YFLogxK
U2 - 10.1145/304182.304196
DO - 10.1145/304182.304196
M3 - Conference contribution
AN - SCOPUS:85134325545
T3 - Proceedings of the ACM SIGMOD International Conference on Management of Data
SP - 157
EP - 168
BT - SIGMOD/PODS 1999 - Proceedings of the 1999 ACM SIGMOD International Conference on Management of Data and Symposium on Principles of Database Systems
PB - Association for Computing Machinery
T2 - 1999 ACM SIGMOD International Conference on Management of Data and Symposium on Principles of Database Systems, SIGMOD/PODS 1999
Y2 - 31 May 1999 through 3 June 1999
ER -