TY - GEN
T1 - RAB: Provable Robustness Against Backdoor Attacks
T2 - 44th IEEE Symposium on Security and Privacy, SP 2023
AU - Weber, Maurice
AU - Xu, Xiaojun
AU - Karlaš, Bojan
AU - Zhang, Ce
AU - Li, Bo
N1 - Funding Information:
This work is partially supported by NSF grant No. 1910100, NSF CNS 2046726, C3 AI, and the Alfred P. Sloan Foundation. CZ and the DS3Lab gratefully acknowledge the support from the Swiss State Secretariat for Education, Research and Innovation (SERI) under contract number MB22.00036 (for European Research Council (ERC) Starting Grant TRIDENT 101042665), the Swiss National Science Foundation (Project Numbers 200021_184628 and 197485), Innosuisse/SNF BRIDGE Discovery (Project Number 40B2-0_187132), the European Union Horizon 2020 Research and Innovation Programme (DAPHNE, 957407), the Botnar Research Centre for Child Health, the Swiss Data Science Center, Alibaba, Cisco, eBay, Google Focused Research Awards, Kuaishou Inc., Oracle Labs, Zurich Insurance, and the Department of Computer Science at ETH Zurich.
Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Recent studies have shown that deep neural networks (DNNs) are vulnerable to adversarial attacks, including evasion and backdoor (poisoning) attacks. On the defense side, there have been intensive efforts on improving both empirical and provable robustness against evasion attacks; however, provable robustness against backdoor attacks remains largely unexplored. In this paper, we focus on certifying machine learning model robustness against general threat models, especially backdoor attacks. We first provide a unified framework via randomized smoothing techniques and show how it can be instantiated to certify the robustness against both evasion and backdoor attacks. We then propose the first robust training process, RAB, to smooth the trained model and certify its robustness against backdoor attacks. We theoretically prove the robustness bound for machine learning models trained with RAB and prove that our robustness bound is tight. In addition, we theoretically show that it is possible to train the robust smoothed models efficiently for simple models such as K-nearest neighbor classifiers, and we propose an exact smooth-training algorithm that eliminates the need to sample from a noise distribution for such models. Empirically, we conduct comprehensive experiments for different machine learning (ML) models such as DNNs, support vector machines, and K-NN models on MNIST, CIFAR-10, and ImageNette datasets and provide the first benchmark for certified robustness against backdoor attacks. In addition, we evaluate K-NN models on the Spambase tabular dataset to demonstrate the advantages of the proposed exact algorithm. Both the theoretical analysis and the comprehensive evaluation on diverse ML models and datasets shed light on further robust learning strategies against general training-time attacks.
AB - Recent studies have shown that deep neural networks (DNNs) are vulnerable to adversarial attacks, including evasion and backdoor (poisoning) attacks. On the defense side, there have been intensive efforts on improving both empirical and provable robustness against evasion attacks; however, provable robustness against backdoor attacks remains largely unexplored. In this paper, we focus on certifying machine learning model robustness against general threat models, especially backdoor attacks. We first provide a unified framework via randomized smoothing techniques and show how it can be instantiated to certify the robustness against both evasion and backdoor attacks. We then propose the first robust training process, RAB, to smooth the trained model and certify its robustness against backdoor attacks. We theoretically prove the robustness bound for machine learning models trained with RAB and prove that our robustness bound is tight. In addition, we theoretically show that it is possible to train the robust smoothed models efficiently for simple models such as K-nearest neighbor classifiers, and we propose an exact smooth-training algorithm that eliminates the need to sample from a noise distribution for such models. Empirically, we conduct comprehensive experiments for different machine learning (ML) models such as DNNs, support vector machines, and K-NN models on MNIST, CIFAR-10, and ImageNette datasets and provide the first benchmark for certified robustness against backdoor attacks. In addition, we evaluate K-NN models on the Spambase tabular dataset to demonstrate the advantages of the proposed exact algorithm. Both the theoretical analysis and the comprehensive evaluation on diverse ML models and datasets shed light on further robust learning strategies against general training-time attacks.
KW - Backdoor attacks
KW - Certified robustness
KW - Machine learning robustness
UR - http://www.scopus.com/inward/record.url?scp=85162844807&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85162844807&partnerID=8YFLogxK
U2 - 10.1109/SP46215.2023.10179451
DO - 10.1109/SP46215.2023.10179451
M3 - Conference contribution
AN - SCOPUS:85162844807
T3 - Proceedings - IEEE Symposium on Security and Privacy
SP - 1311
EP - 1328
BT - Proceedings - 44th IEEE Symposium on Security and Privacy, SP 2023
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 22 May 2023 through 25 May 2023
ER -