LLM-Mod: Can Large Language Models Assist Content Moderation?

Mahi Kolla, Siddharth Salunkhe, Eshwar Chandrasekharan, Koustuv Saha

Research output: Chapter in Book/Report/Conference proceeding - Conference contribution

Abstract

Content moderation is critical for maintaining healthy online spaces, yet it remains a predominantly manual task. Moderators are often exhausted by the low moderator-to-post ratio, and researchers have been exploring computational tools to assist them. The natural language understanding capabilities of large language models (LLMs) open up possibilities for using LLMs in online moderation. This work explores the feasibility of using LLMs to identify rule violations on Reddit. We examine how an LLM-based moderator (LLM-Mod) reasons about 744 posts across 9 subreddits that violate different types of rules. We find that while LLM-Mod has a good true-negative rate (92.3%), it has a poor true-positive rate (43.1%), performing badly when flagging rule-violating posts. LLM-Mod is likely to flag keyword-matching-based rule violations, but cannot reason about posts of higher complexity. We discuss considerations for integrating LLMs into content moderation workflows and for designing platforms that support both AI-driven and human-in-the-loop moderation.
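The reported rates are standard confusion-matrix metrics: the true-positive rate (fraction of rule-violating posts correctly flagged) and the true-negative rate (fraction of compliant posts correctly passed). A minimal sketch of how such rates are computed, using hypothetical counts for illustration (not the paper's data):

```python
def true_positive_rate(tp: int, fn: int) -> float:
    """Fraction of actual violations that were flagged (sensitivity/recall)."""
    return tp / (tp + fn)

def true_negative_rate(tn: int, fp: int) -> float:
    """Fraction of compliant posts that were correctly passed (specificity)."""
    return tn / (tn + fp)

# Hypothetical counts for illustration only:
tp, fn = 43, 57   # violating posts: flagged vs. missed
tn, fp = 92, 8    # compliant posts: passed vs. wrongly flagged

print(true_positive_rate(tp, fn))  # 0.43
print(true_negative_rate(tn, fp))  # 0.92
```

On these example counts the moderator mirrors the paper's finding: it rarely disturbs compliant posts but misses most violations.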

Original language: English (US)
Title of host publication: CHI 2024 - Extended Abstracts of the 2024 CHI Conference on Human Factors in Computing Systems
Publisher: Association for Computing Machinery
ISBN (Electronic): 9798400703317
State: Published - May 11, 2024
Event: 2024 CHI Conference on Human Factors in Computing Systems, CHI EA 2024 - Hybrid, Honolulu, United States
Duration: May 11, 2024 - May 16, 2024

Publication series

Name: Conference on Human Factors in Computing Systems - Proceedings

Conference

Conference: 2024 CHI Conference on Human Factors in Computing Systems, CHI EA 2024
Country/Territory: United States
City: Hybrid, Honolulu
Period: 5/11/24 - 5/16/24

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Computer Graphics and Computer-Aided Design
  • Software

