RAB: Provable Robustness Against Backdoor Attacks

Maurice Weber, Xiaojun Xu, Bojan Karlaš, Ce Zhang, Bo Li

Research output: Chapter in Book/Report/Conference proceeding - Conference contribution

Abstract

Recent studies have shown that deep neural networks (DNNs) are vulnerable to adversarial attacks, including evasion and backdoor (poisoning) attacks. On the defense side, there have been intensive efforts to improve both empirical and provable robustness against evasion attacks; however, provable robustness against backdoor attacks remains largely unexplored. In this paper, we focus on certifying machine learning model robustness against general threat models, especially backdoor attacks. We first provide a unified framework via randomized smoothing techniques and show how it can be instantiated to certify robustness against both evasion and backdoor attacks. We then propose the first robust training process, RAB, to smooth the trained model and certify its robustness against backdoor attacks. We theoretically prove the robustness bound for machine learning models trained with RAB and show that this bound is tight. In addition, we theoretically show that it is possible to train the robust smoothed models efficiently for simple models such as K-nearest neighbor (K-NN) classifiers, and we propose an exact smooth-training algorithm that eliminates the need to sample from a noise distribution for such models. Empirically, we conduct comprehensive experiments with different machine learning (ML) models, such as DNNs, support vector machines, and K-NN models, on the MNIST, CIFAR-10, and ImageNette datasets, and provide the first benchmark for certified robustness against backdoor attacks. In addition, we evaluate K-NN models on the Spambase tabular dataset to demonstrate the advantages of the proposed exact algorithm. Both the theoretical analysis and the comprehensive evaluation on diverse ML models and datasets shed light on further robust learning strategies against general training-time attacks.
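The core smoothing idea described in the abstract, training an ensemble of classifiers on noise-perturbed copies of the (possibly poisoned) training set and aggregating their votes at prediction time, can be illustrated with the minimal sketch below. The ensemble size, Gaussian noise scale, and the use of a scikit-learn K-NN base learner are illustrative assumptions; the sketch omits the paper's certification computation and test-time handling and is not the RAB implementation.

```python
# Minimal sketch, assuming Gaussian smoothing noise and a scikit-learn K-NN
# base classifier; hyperparameters are placeholders, not the paper's settings.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier


def train_smoothed_ensemble(X_train, y_train, n_models=100, sigma=1.0, seed=0):
    """Train one base model per noise-perturbed copy of the training features."""
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(n_models):
        noisy_X = X_train + rng.normal(scale=sigma, size=X_train.shape)
        models.append(KNeighborsClassifier(n_neighbors=5).fit(noisy_X, y_train))
    return models


def smoothed_predict(models, X_test, n_classes):
    """Aggregate per-model votes; the gap between the top two vote counts is
    the quantity a certification bound would be derived from."""
    votes = np.zeros((X_test.shape[0], n_classes), dtype=int)
    for clf in models:
        for i, pred in enumerate(clf.predict(X_test)):
            votes[i, int(pred)] += 1
    return votes.argmax(axis=1), votes


if __name__ == "__main__":
    # Toy usage on random data as a stand-in for real image or tabular features.
    rng = np.random.default_rng(1)
    X, y = rng.normal(size=(200, 20)), rng.integers(0, 2, size=200)
    ensemble = train_smoothed_ensemble(X, y, n_models=20, sigma=0.5)
    preds, votes = smoothed_predict(ensemble, rng.normal(size=(10, 20)), n_classes=2)
    print(preds)
```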

Original language: English (US)
Title of host publication: Proceedings - 44th IEEE Symposium on Security and Privacy, SP 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1311-1328
Number of pages: 18
ISBN (Electronic): 9781665493369
State: Published - 2023
Event: 44th IEEE Symposium on Security and Privacy, SP 2023 - Hybrid, San Francisco, United States
Duration: May 22, 2023 - May 25, 2023

Publication series

Name: Proceedings - IEEE Symposium on Security and Privacy
Volume: 2023-May
ISSN (Print): 1081-6011

Conference

Conference: 44th IEEE Symposium on Security and Privacy, SP 2023
Country/Territory: United States
City: Hybrid, San Francisco
Period: 5/22/23 - 5/25/23

Keywords

  • Backdoor attacks
  • Certified robustness
  • Machine learning robustness

ASJC Scopus subject areas

  • Safety, Risk, Reliability and Quality
  • Software
  • Computer Networks and Communications
