TY - GEN
T1 - Towards Universal Adversarial Examples and Defenses
AU - Rakin, Adnan Siraj
AU - Wang, Ye
AU - Aeron, Shuchin
AU - Koike-Akino, Toshiaki
AU - Moulin, Pierre
AU - Parsons, Kieran
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021
Y1 - 2021
AB - Adversarial examples have recently exposed the severe vulnerability of neural network models. However, most of the existing attacks require some form of target model information (i.e., weights/model inquiry/architecture) to improve the efficacy of the attack. We leverage the information-theoretic connections between robust learning and generalized rate-distortion theory to formulate a universal adversarial example (UAE) generation algorithm. Our algorithm trains an offline adversarial generator to minimize the mutual information between the label and perturbed data. At the inference phase, our UAE method can efficiently generate effective adversarial examples without high computation cost. These adversarial examples in turn allow for developing universal defenses through adversarial training. Our experiments demonstrate promising gains in improving the training efficiency of conventional adversarial training.
UR - http://www.scopus.com/inward/record.url?scp=85123433892&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85123433892&partnerID=8YFLogxK
U2 - 10.1109/ITW48936.2021.9611439
DO - 10.1109/ITW48936.2021.9611439
M3 - Conference contribution
AN - SCOPUS:85123433892
T3 - 2021 IEEE Information Theory Workshop, ITW 2021 - Proceedings
BT - 2021 IEEE Information Theory Workshop, ITW 2021 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2021 IEEE Information Theory Workshop, ITW 2021
Y2 - 17 October 2021 through 21 October 2021
ER -