PrivacyChat: Utilizing Large Language Model for Fine-Grained Information Extraction over Privacy Policies

Rohan Charudatt Salvi, Catherine Blake, Masooda Bashir

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Privacy policies play a crucial role in upholding the privacy rights of users and fostering trust between organizations and their users. By clearly understanding the terms and conditions of a privacy policy, individuals can make well-informed choices about disclosing their personal information and understand how the concerned entity will manage their data. Following the introduction of the General Data Protection Regulation, these policies have become more extensive and intricate. This creates a challenge for users in terms of understanding and finding specific information in the policy. Today, through prompt-based methods, we can extract specific data from extensive text documents using large language models (LLMs), thus eliminating the need for training or fine-tuning models. In this study, we explore a prompt-based approach to extract information concerning personal data from privacy policies using a large language model, GPT-3.5. In this preliminary study, we assess the performance of GPT-3.5 on such a fine-grained extraction task through varied metrics and its capability to address previous computational challenges. The prompt structure can be adapted for other LLMs, and a similar approach can be employed for various information extraction tasks over privacy policies. The data and code are available at our GitHub repository.

Original language: English (US)
Title of host publication: Wisdom, Well-Being, Win-Win - 19th International Conference, iConference 2024, Proceedings
Editors: Isaac Sserwanga, Hideo Joho, Jie Ma, Preben Hansen, Dan Wu, Masanori Koizumi, Anne J. Gilliland
Publisher: Springer
Pages: 223-231
Number of pages: 9
ISBN (Print): 9783031578496
State: Published - 2024
Event: 19th International Conference on Wisdom, Well-Being, Win-Win, iConference 2024 - Changchun, China
Duration: Apr 15 2024 → Apr 26 2024

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14596 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 19th International Conference on Wisdom, Well-Being, Win-Win, iConference 2024
Country/Territory: China
City: Changchun
Period: 4/15/24 → 4/26/24

Keywords

  • Information Extraction
  • Large Language Models
  • Privacy Policy

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
