Feature cross-substitution in adversarial classification

Bo Li, Yevgeniy Vorobeychik

Research output: Contribution to journalConference articlepeer-review

Abstract

The success of machine learning, particularly in supervised settings, has led to numerous attempts to apply it in adversarial settings such as spam and malware detection. The core challenge in this class of applications is that adversaries are not static data generators, but make a deliberate effort to evade the classifiers deployed to detect them. We investigate both the problem of modeling the objectives of such adversaries, as well as the algorithmic problem of accounting for rational, objective-driven adversaries. In particular, we demonstrate severe shortcomings of feature reduction in adversarial settings using several natural adversarial objective functions, an observation that is particularly pronounced when the adversary is able to substitute across similar features (for example, replace words with synonyms or replace letters in words). We offer a simple heuristic method for making learning more robust to feature cross-substitution attacks. We then present a more general approach based on mixed-integer linear programming with constraint generation, which implicitly trades off overfitting and feature selection in an adversarial setting using a sparse regularizer along with an evasion model. Our approach is the first method for combining an adversarial classification algorithm with a very general class of models of adversarial classifier evasion. We show that our algorithmic approach significantly outperforms state-of-the-art alternatives.

Original languageEnglish (US)
Pages (from-to)2087-2095
Number of pages9
JournalAdvances in Neural Information Processing Systems
Volume3
Issue numberJanuary
StatePublished - 2014
Externally publishedYes
Event28th Annual Conference on Neural Information Processing Systems 2014, NIPS 2014 - Montreal, Canada
Duration: Dec 8 2014Dec 13 2014

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing

Fingerprint

Dive into the research topics of 'Feature cross-substitution in adversarial classification'. Together they form a unique fingerprint.

Cite this