TY - GEN
T1 - Policylint
T2 - 28th USENIX Security Symposium
AU - Andow, Benjamin
AU - Mahmud, Samin Yaseer
AU - Wang, Wenyu
AU - Whitaker, Justin
AU - Enck, William
AU - Reaves, Bradley
AU - Singh, Kapil
AU - Xie, Tao
N1 - Publisher Copyright:
© 2019 by The USENIX Association. All rights reserved.
PY - 2019
Y1 - 2019
N2 - Privacy policies are the primary mechanism by which companies inform users about data collection and sharing practices. To help users better understand these long and complex legal documents, recent research has proposed tools that summarize collection and sharing. However, these tools have a significant oversight: they do not account for contradictions that may occur within an individual policy. In this paper, we present PolicyLint, a privacy policy analysis tool that identifies such contradictions by simultaneously considering negation and varying semantic levels of data objects and entities. To do so, PolicyLint automatically generates ontologies from a large corpus of privacy policies and uses sentence-level natural language processing to capture both positive and negative statements of data collection and sharing. We use PolicyLint to analyze the policies of 11,430 apps and find that 14.2% of these policies contain contradictions that may be indicative of misleading statements. We manually verify 510 contradictions, identifying concerning trends that include the use of misleading presentation, attempted redefinition of common understandings of terms, conflicts in regulatory definitions (e.g., US and EU), and “laundering” of tracking information facilitated by sharing or collecting data that can be used to derive sensitive information. In doing so, PolicyLint significantly advances automated analysis of privacy policies.
AB - Privacy policies are the primary mechanism by which companies inform users about data collection and sharing practices. To help users better understand these long and complex legal documents, recent research has proposed tools that summarize collection and sharing. However, these tools have a significant oversight: they do not account for contradictions that may occur within an individual policy. In this paper, we present PolicyLint, a privacy policy analysis tool that identifies such contradictions by simultaneously considering negation and varying semantic levels of data objects and entities. To do so, PolicyLint automatically generates ontologies from a large corpus of privacy policies and uses sentence-level natural language processing to capture both positive and negative statements of data collection and sharing. We use PolicyLint to analyze the policies of 11,430 apps and find that 14.2% of these policies contain contradictions that may be indicative of misleading statements. We manually verify 510 contradictions, identifying concerning trends that include the use of misleading presentation, attempted redefinition of common understandings of terms, conflicts in regulatory definitions (e.g., US and EU), and “laundering” of tracking information facilitated by sharing or collecting data that can be used to derive sensitive information. In doing so, PolicyLint significantly advances automated analysis of privacy policies.
UR - http://www.scopus.com/inward/record.url?scp=85076357196&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85076357196&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85076357196
T3 - Proceedings of the 28th USENIX Security Symposium
SP - 585
EP - 602
BT - Proceedings of the 28th USENIX Security Symposium
PB - USENIX Association
Y2 - 14 August 2019 through 16 August 2019
ER -