Provably robust deep learning via adversarially trained smoothed classifiers

Hadi Salman, Greg Yang, Jerry Li, Pengchuan Zhang, Huan Zhang, Ilya Razenshteyn, Sébastien Bubeck

Research output: Contribution to journal › Conference article › peer-review

Abstract

Recent works have shown the effectiveness of randomized smoothing as a scalable technique for building neural network-based classifiers that are provably robust to ℓ2-norm adversarial perturbations. In this paper, we employ adversarial training to improve the performance of randomized smoothing. We design an adapted attack for smoothed classifiers, and we show how this attack can be used in an adversarial training setting to boost the provable robustness of smoothed classifiers. We demonstrate through extensive experimentation that our method consistently outperforms all existing provably ℓ2-robust classifiers by a significant margin on ImageNet and CIFAR-10, establishing the state-of-the-art for provable ℓ2-defenses. Moreover, we find that pre-training and semi-supervised learning boost adversarially trained smoothed classifiers even further.
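The adapted attack described in the abstract runs projected gradient descent against the smoothed soft classifier, whose expectation over Gaussian noise is estimated with a small number of Monte Carlo samples. The following is a minimal PyTorch sketch of that idea, not the authors' released implementation; the function name smooth_adv_attack and the parameter defaults (sigma, eps, steps, m) are illustrative assumptions, and the code assumes 4D image batches.

```python
# Minimal sketch (illustrative, not the authors' code) of a PGD-style attack on
# a smoothed classifier: the Gaussian-noise expectation is approximated with m
# Monte Carlo samples, and gradients flow through the averaged softmax.
import torch
import torch.nn.functional as F

def smooth_adv_attack(model, x, y, sigma=0.25, eps=1.0, steps=10, m=4):
    """PGD in the l2 ball of radius `eps` against the smoothed soft classifier."""
    alpha = 2 * eps / steps                      # step size (common heuristic)
    x_adv = x.clone()
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        # Monte Carlo estimate of E_{delta ~ N(0, sigma^2 I)}[softmax(f(x + delta))]
        probs = torch.stack([
            F.softmax(model(x_adv + sigma * torch.randn_like(x_adv)), dim=1)
            for _ in range(m)
        ]).mean(dim=0)
        loss = F.nll_loss(torch.log(probs + 1e-12), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # l2-normalized gradient ascent step on the loss
        grad_norm = grad.flatten(1).norm(dim=1).clamp_min(1e-12).view(-1, 1, 1, 1)
        x_adv = x_adv.detach() + alpha * grad / grad_norm
        # project back into the l2 ball of radius eps around the clean input
        delta = x_adv - x
        delta_norm = delta.flatten(1).norm(dim=1).clamp_min(1e-12).view(-1, 1, 1, 1)
        x_adv = x + delta * torch.clamp(eps / delta_norm, max=1.0)
    return x_adv.detach()
```

Adversarial training then minimizes the base classifier's loss on these adversarial examples under the same Gaussian noise, which ties the attack to the smoothed model that is ultimately certified.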

Original language: English (US)
Journal: Advances in Neural Information Processing Systems
Volume: 32
State: Published - 2019
Externally published: Yes
Event: 33rd Annual Conference on Neural Information Processing Systems, NeurIPS 2019 - Vancouver, Canada
Duration: Dec 8 2019 - Dec 14 2019

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing
