TY - GEN
T1 - POSTER: Identifying and Mitigating Vulnerabilities in LLM-Integrated Applications
T2 - 19th ACM Asia Conference on Computer and Communications Security, AsiaCCS 2024
AU - Jiang, Fengqing
AU - Xu, Zhangchen
AU - Niu, Luyao
AU - Wang, Boxin
AU - Jia, Jinyuan
AU - Li, Bo
AU - Poovendran, Radha
N1 - This work is partially supported by the Air Force Office of Scientific Research (AFOSR) under grant FA9550-23-1-0208, the National Science Foundation (NSF) under grants No. 1910100, No. 2046726, and No. 2229876, DARPA GARD, the National Aeronautics and Space Administration (NASA) under grant No. 80NSSC20M0229, an Alfred P. Sloan Fellowship, the Office of Naval Research (ONR) under grant N00014-23-1-2386, and an Amazon research award. This work is supported in part by funds provided by the National Science Foundation, the Department of Homeland Security, and IBM. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation or its federal agency and industry partners.
PY - 2024/7/1
Y1 - 2024/7/1
N2 - Compared with the traditional usage of large language models (LLMs), where users send queries directly to an LLM, LLM-integrated applications serve as middleware that refines users’ queries with domain-specific knowledge to better inform LLMs and enhance their responses. However, LLM-integrated applications also introduce new attack surfaces. This work considers a setup where the user and the LLM interact via an application in the middle. We focus on interactions that begin with a user’s query and end with the LLM-integrated application returning a response, powered by LLMs at the service backend. We identify potential high-risk vulnerabilities in this setting that can originate from a malicious application developer or from an outsider threat initiator who can control database access and manipulate or poison data that poses high risk to the user. Successful exploits of the identified vulnerabilities result in users receiving responses tailored to the intent of the threat initiator. We assess such threats against LLM-integrated applications powered by GPT-3.5 and GPT-4. Our experiments show that these threats can effectively bypass OpenAI’s restrictions and moderation policies, exposing users to risks of bias, toxic content, privacy violations, and disinformation. We develop a lightweight, threat-agnostic defense that mitigates both insider and outsider threats. Our evaluations demonstrate the efficacy of our defense.
AB - Compared with the traditional usage of large language models (LLMs), where users send queries directly to an LLM, LLM-integrated applications serve as middleware that refines users’ queries with domain-specific knowledge to better inform LLMs and enhance their responses. However, LLM-integrated applications also introduce new attack surfaces. This work considers a setup where the user and the LLM interact via an application in the middle. We focus on interactions that begin with a user’s query and end with the LLM-integrated application returning a response, powered by LLMs at the service backend. We identify potential high-risk vulnerabilities in this setting that can originate from a malicious application developer or from an outsider threat initiator who can control database access and manipulate or poison data that poses high risk to the user. Successful exploits of the identified vulnerabilities result in users receiving responses tailored to the intent of the threat initiator. We assess such threats against LLM-integrated applications powered by GPT-3.5 and GPT-4. Our experiments show that these threats can effectively bypass OpenAI’s restrictions and moderation policies, exposing users to risks of bias, toxic content, privacy violations, and disinformation. We develop a lightweight, threat-agnostic defense that mitigates both insider and outsider threats. Our evaluations demonstrate the efficacy of our defense.
UR - https://www.scopus.com/pages/publications/85199274156
UR - https://www.scopus.com/inward/citedby.url?scp=85199274156&partnerID=8YFLogxK
U2 - 10.1145/3634737.3659433
DO - 10.1145/3634737.3659433
M3 - Conference contribution
AN - SCOPUS:85199274156
T3 - ACM AsiaCCS 2024 - Proceedings of the 19th ACM Asia Conference on Computer and Communications Security
SP - 1949
EP - 1951
BT - ACM AsiaCCS 2024 - Proceedings of the 19th ACM Asia Conference on Computer and Communications Security
PB - Association for Computing Machinery
Y2 - 1 July 2024 through 5 July 2024
ER -