Evasion-robust classification on binary domains

Bo Li, Yevgeniy Vorobeychik

Research output: Contribution to journalArticlepeer-review

Abstract

The success of classification learning has led to numerous attempts to apply it in adversarial settings such as spam and malware detection. The core challenge in this class of applications is that adversaries are not static, but make a deliberate effort to evade the classifiers. We investigate both the problem of modeling the objectives of such adversaries, as well as the algorithmic problem of accounting for rational, objective-driven adversaries. We first present a general approach based on mixed-integer linear programming (MILP) with constraint generation. This approach is the first to compute an optimal solution to adversarial loss minimization for two general classes of adversarial evasion models in the context of binary feature spaces. To further improve scalability and significantly generalize the scope of the MILP-based method, we propose a principled iterative retraining framework, which can be used with arbitrary classifiers and essentially arbitrary attack models. We show that the retraining approach, when it converges, minimizes an upper bound on adversarial loss. Extensive experiments demonstrate that the mixed-integer programming approach significantly outperforms several state-of-the-art adversarial learning alternatives. Moreover, the retraining framework performs nearly as well, but scales significantly better. Finally, we show that our approach is robust to misspecifications of the adversarial model.

Original languageEnglish (US)
Article numbera50
JournalACM Transactions on Knowledge Discovery from Data
Volume12
Issue number4
DOIs
StatePublished - Jul 2018
Externally publishedYes

Keywords

  • Adversarial classification
  • Adversarial examples
  • Classifier evasion
  • Mixed-integer linear programming
  • Robust learning

ASJC Scopus subject areas

  • General Computer Science

Fingerprint

Dive into the research topics of 'Evasion-robust classification on binary domains'. Together they form a unique fingerprint.

Cite this