TY - JOUR
T1 - Robustness verification of tree-based models
AU - Chen, Hongge
AU - Zhang, Huan
AU - Si, Si
AU - Li, Yang
AU - Boning, Duane
AU - Hsieh, Cho Jui
N1 - Funding Information:
Acknowledgement. Chen and Boning acknowledge the support of SenseTime. Hsieh acknowledges the support of NSF IIS-1719097 and Intel faculty award.
Publisher Copyright:
© 2019 Neural information processing systems foundation. All rights reserved.
PY - 2019
Y1 - 2019
N2 - We study the robustness verification problem for tree based models, including decision trees, random forests (RFs) and gradient boosted decision trees (GBDTs). Formal robustness verification of decision tree ensembles involves finding the exact minimal adversarial perturbation or a guaranteed lower bound of it. Existing approaches find the minimal adversarial perturbation by a mixed integer linear programming (MILP) problem, which takes exponential time so is impractical for large ensembles. Although this verification problem is NP-complete in general, we give a more precise complexity characterization. We show that there is a simple linear time algorithm for verifying a single tree, and for tree ensembles the verification problem can be cast as a max-clique problem on a multi-partite graph with bounded boxicity. For low dimensional problems when boxicity can be viewed as constant, this reformulation leads to a polynomial time algorithm. For general problems, by exploiting the boxicity of the graph, we develop an efficient multi-level verification algorithm that can give tight lower bounds on robustness of decision tree ensembles, while allowing iterative improvement and any-time termination. On RF/GBDT models trained on 10 datasets, our algorithm is hundreds of times faster than a previous approach that requires solving MILPs, and is able to give tight robustness verification bounds on large GBDTs with hundreds of deep trees.
AB - We study the robustness verification problem for tree based models, including decision trees, random forests (RFs) and gradient boosted decision trees (GBDTs). Formal robustness verification of decision tree ensembles involves finding the exact minimal adversarial perturbation or a guaranteed lower bound of it. Existing approaches find the minimal adversarial perturbation by a mixed integer linear programming (MILP) problem, which takes exponential time so is impractical for large ensembles. Although this verification problem is NP-complete in general, we give a more precise complexity characterization. We show that there is a simple linear time algorithm for verifying a single tree, and for tree ensembles the verification problem can be cast as a max-clique problem on a multi-partite graph with bounded boxicity. For low dimensional problems when boxicity can be viewed as constant, this reformulation leads to a polynomial time algorithm. For general problems, by exploiting the boxicity of the graph, we develop an efficient multi-level verification algorithm that can give tight lower bounds on robustness of decision tree ensembles, while allowing iterative improvement and any-time termination. On RF/GBDT models trained on 10 datasets, our algorithm is hundreds of times faster than a previous approach that requires solving MILPs, and is able to give tight robustness verification bounds on large GBDTs with hundreds of deep trees.
UR - http://www.scopus.com/inward/record.url?scp=85090148825&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85090148825&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85090148825
SN - 1049-5258
VL - 32
JO - Advances in Neural Information Processing Systems
JF - Advances in Neural Information Processing Systems
T2 - 33rd Annual Conference on Neural Information Processing Systems, NeurIPS 2019
Y2 - 8 December 2019 through 14 December 2019
ER -