TY - GEN
T1 - Double-Cross Attacks: Subverting Active Learning Systems
T2 - 30th USENIX Security Symposium, USENIX Security 2021
AU - Sanchez Vicarte, Jose Rodrigo
AU - Wang, Gang
AU - Fletcher, Christopher W.
N1 - Funding Information:
This work was partially funded by NSF grants 1942888 and 2030521, an Intel ISRA, and an Amazon Research Award.
Publisher Copyright:
© 2021 by The USENIX Association. All rights reserved.
PY - 2021
Y1 - 2021
AB - Active learning is widely used in data labeling services to support real-world machine learning applications. By selecting and labeling the samples that have the highest impact on model retraining, active learning reduces labeling effort and thus cost. In this paper, we present a novel attack called Double Cross, which aims to manipulate data labeling and model training in active learning settings. To perform a double-cross attack, the adversary crafts inputs with a special trigger pattern and sends the triggered inputs to the victim's model retraining pipeline. The goals of the triggered inputs are (1) to get selected for labeling and retraining by the victim; (2) to subsequently mislead human annotators into assigning an adversary-selected label; and (3) to change the victim model's behavior after retraining occurs. After retraining, the attack causes the victim to mislabel any sample bearing the trigger pattern as the adversary-chosen class, while the labeling of samples without the trigger pattern is unaffected. We develop a trigger generation method that simultaneously achieves all three goals. We evaluate the attack on multiple existing image classifiers and demonstrate that both gray-box and black-box attacks succeed. Furthermore, we perform experiments on a real-world machine learning platform (Amazon SageMaker) with human annotators in the loop, confirming the practicality of the attack. Finally, we discuss the implications of these results and the open research questions moving forward.
UR - http://www.scopus.com/inward/record.url?scp=85114499583&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85114499583&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85114499583
T3 - Proceedings of the 30th USENIX Security Symposium
SP - 1593
EP - 1610
BT - Proceedings of the 30th USENIX Security Symposium
PB - USENIX Association
Y2 - 11 August 2021 through 13 August 2021
ER -