TY - GEN
T1 - PropaInsight
T2 - 31st International Conference on Computational Linguistics, COLING 2025
AU - Liu, Jiateng
AU - Ai, Lin
AU - Liu, Zizhou
AU - Karisani, Payam
AU - Hui, Zheng
AU - Fung, May
AU - Nakov, Preslav
AU - Hirschberg, Julia
AU - Ji, Heng
N1 - This research was done with funding from the Defense Advanced Research Projects Agency (DARPA) under Contracts No. HR001120C0123 and HR0011-24-3-0325. The views, opinions, and/or findings expressed are those of the authors and should not be interpreted as representing the official views or policies of the Department of Defense or the U.S. Government. We also thank Kitware Inc. and Rapidata Inc. for their help in the data annotation process.
PY - 2025
Y1 - 2025
N2 - Propaganda plays a critical role in shaping public opinion and fueling disinformation. While existing research primarily focuses on identifying propaganda techniques, it lacks the ability to capture the broader motives and the impacts of such content. To address these challenges, we introduce PropaInsight, a conceptual framework grounded in foundational social science research, which systematically dissects propaganda into techniques, arousal appeals, and underlying intent. PropaInsight offers a more granular understanding of how propaganda operates across different contexts. Additionally, we present PropaGaze, a novel dataset that combines human-annotated data with high-quality synthetic data generated through a meticulously designed pipeline. Our experiments show that off-the-shelf LLMs struggle with propaganda analysis, but training with PropaGaze significantly improves performance. Fine-tuned Llama-7B-Chat achieves 203.4% higher text span IoU in technique identification and 66.2% higher BERTScore in appeal analysis compared to 1-shot GPT-4-Turbo. Moreover, PropaGaze complements limited human-annotated data in data-sparse and cross-domain scenarios, showing its potential for comprehensive and generalizable propaganda analysis.
UR - http://www.scopus.com/inward/record.url?scp=85218493819&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85218493819&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85218493819
T3 - Proceedings - International Conference on Computational Linguistics, COLING
SP - 5607
EP - 5628
BT - Main Conference
A2 - Rambow, Owen
A2 - Wanner, Leo
A2 - Apidianaki, Marianna
A2 - Al-Khalifa, Hend
A2 - Di Eugenio, Barbara
A2 - Schockaert, Steven
PB - Association for Computational Linguistics (ACL)
Y2 - 19 January 2025 through 24 January 2025
ER -