TY - GEN
T1 - Cloud Privacy Beyond Legal Compliance
T2 - 2024 IEEE Cloud Summit, Cloud Summit 2024
AU - Kilhoffer, Zachary
AU - Bashir, Masooda
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
AB - By implementing standards and becoming certified, organizations can demonstrate good practices and trustworthiness. However, privacy standards are relatively immature, and the privacy research community rarely examines the individual controls of organizational standards (e.g., ISO 27017, SOC-2), which are what concretely implements privacy principles. It is also very time-consuming to monitor evolving standards, assess relevance and usefulness in a given context, and whether the effort and expense of becoming certified makes sense. In this paper, we propose an exploratory method leveraging a large language model (LLM) to analyze privacy documents. We created a dataset of controls (n = 1,511) from all nine standards identified as certifiable, cloud relevant, and privacy relevant. We fine-tuned BERT, a popular baseline LLM, to optimize performance on privacy standards. Finally, we performed topic modeling to better understand how the standards address privacy challenges and compare to one another. We demonstrate that controls can be grouped into 11 topics (e.g., "PII Management", "Continuous Monitoring and Auditing in Cloud"). Most standards seem to strongly emphasize the security and risk angles of privacy rather than rights and control over data. The results suggest efforts to standardize privacy practices are still nascent - more time, practice, and theoretical agreement is required before privacy standards approach the rigor of their security counterparts. By providing our fine-tuned model, coding pipeline, and method, we demonstrate the utility of this approach to better compare and understand privacy standards and other documentation for assessment and refining.
KW - certification
KW - controls
KW - privacy
KW - security
KW - standards
UR - http://www.scopus.com/inward/record.url?scp=85202430372&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85202430372&partnerID=8YFLogxK
U2 - 10.1109/Cloud-Summit61220.2024.00020
DO - 10.1109/Cloud-Summit61220.2024.00020
M3 - Conference contribution
AN - SCOPUS:85202430372
T3 - Proceeding - 2024 IEEE Cloud Summit, Cloud Summit 2024
SP - 79
EP - 86
BT - Proceeding - 2024 IEEE Cloud Summit, Cloud Summit 2024
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 27 June 2024 through 28 June 2024
ER -