Exploiting Programmatic Behavior of LLMs: Dual-Use Through Standard Security Attacks

Daniel Kang, Xuechen Li, Ion Stoica, Carlos Guestrin, Matei Zaharia, Tatsunori Hashimoto

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Recent advances in instruction-following large language models (LLMs) have led to dramatic improvements in a range of NLP tasks. Unfortunately, we find that the same improved capabilities amplify the dual-use risks of these models for malicious purposes. Dual-use is difficult to prevent as instruction-following capabilities now enable standard attacks from computer security. The capabilities of these instruction-following LLMs provide strong economic incentives for dual-use by malicious actors. In particular, we show that instruction-following LLMs can produce targeted malicious content, including hate speech and scams, bypassing in-the-wild defenses implemented by LLM API vendors. Our analysis shows that this content can be generated economically, at a cost 125-500× lower than human effort alone. Together, our findings suggest that LLMs will increasingly attract more sophisticated adversaries and attacks, and addressing these attacks may require new approaches to mitigations.

Original language: English (US)
Title of host publication: Proceedings - 45th IEEE Symposium on Security and Privacy Workshops, SPW 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 132-143
Number of pages: 12
ISBN (Electronic): 9798350354874
State: Published - 2024
Event: 45th IEEE Symposium on Security and Privacy Workshops, SPW 2024 - San Francisco, United States
Duration: May 23 2024 → …

Publication series

Name: Proceedings - 45th IEEE Symposium on Security and Privacy Workshops, SPW 2024

Conference

Conference: 45th IEEE Symposium on Security and Privacy Workshops, SPW 2024
Country/Territory: United States
City: San Francisco
Period: 5/23/24 → …

ASJC Scopus subject areas

  • Communication
  • Artificial Intelligence
  • Computer Networks and Communications
  • Information Systems
  • Safety, Risk, Reliability and Quality
