Spanning attack: reinforce black-box attacks with unlabeled data

Lu Wang, Huan Zhang, Jinfeng Yi, Cho Jui Hsieh, Yuan Jiang

Research output: Contribution to journal › Article › peer-review

Abstract

Adversarial black-box attacks aim to craft adversarial perturbations by querying input–output pairs of machine learning models. They are widely used to evaluate the robustness of pre-trained models. However, black-box attacks often suffer from query inefficiency due to the high dimensionality of the input space, and can therefore give a false sense of model robustness. In this paper, we relax the conditions of the black-box threat model and propose a novel technique called the spanning attack. By constraining adversarial perturbations to a low-dimensional subspace spanned by an auxiliary unlabeled dataset, the spanning attack significantly improves the query efficiency of a wide variety of existing black-box attacks. Extensive experiments show that the proposed method performs favorably in both soft-label and hard-label black-box attacks.
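
The subspace idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the auxiliary unlabeled inputs are flattened into the rows of a matrix, takes an orthonormal basis of their span from an SVD, and uses a plain random search as a stand-in for an existing soft-label black-box attack. The names `spanning_basis`, `subspace_random_search`, `query_loss`, `eps`, and `sigma` are hypothetical and chosen only for illustration.

```python
import numpy as np


def spanning_basis(unlabeled, k):
    """Return a (d x k) matrix whose orthonormal columns lie in the span
    of the rows of `unlabeled` (n x d). Requires k <= min(n, d)."""
    # Rows of vt are orthonormal directions in input space.
    _, _, vt = np.linalg.svd(unlabeled, full_matrices=False)
    return vt[:k].T


def subspace_random_search(query_loss, x, basis, eps=0.5, steps=100,
                           sigma=0.1, seed=0):
    """Stand-in soft-label attack: random search over k subspace
    coefficients instead of all d input dimensions.

    `query_loss` is an assumed black-box oracle returning a scalar to
    maximize; each call counts as one model query.
    """
    rng = np.random.default_rng(seed)
    k = basis.shape[1]
    coeffs = np.zeros(k)
    best = query_loss(x)
    for _ in range(steps):
        cand = coeffs + sigma * rng.standard_normal(k)
        # Orthonormal columns preserve the L2 norm, so the budget `eps`
        # can be enforced directly on the coefficients.
        norm = np.linalg.norm(cand)
        if norm > eps:
            cand *= eps / norm
        loss = query_loss(x + basis @ cand)
        if loss > best:  # keep only proposals that increase the loss
            coeffs, best = cand, loss
    return basis @ coeffs, best
```

In this sketch the search space shrinks from the input dimension d to k coefficients, which is the sense in which a subspace constraint can reduce the number of queries. How the basis is built and which base attack is wrapped are assumptions here, not details taken from the abstract.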

Original language: English (US)
Pages (from-to): 2349-2368
Number of pages: 20
Journal: Machine Learning
Volume: 109
Issue number: 12
State: Published - Dec 2020
Externally published: Yes

Keywords

  • Adversarial machine learning
  • Adversarial robustness
  • Black-box attacks
  • Query efficiency

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence
