Classifying Target Sentences for LLM-Generated Persuasion Attacks in Press Releases from Federal Research Agencies
DOI:
https://doi.org/10.32473/flairs.39.1.141855
Keywords:
Target Classification, Attack Target, Persuasion Attack, Press Release, Research Agency
Abstract
Information campaigns increasingly use LLMs to generate persuasive competing narratives around federal research agency press releases. Prior work largely centers on post hoc assessment, emphasizing detectability, characterization, and susceptibility after persuasion attacks are observed. In this paper, we build sentence-level classifiers that label whether a sentence in a source press release is an attack target under 23 persuasion techniques and three generating LLMs, using 972 U.S. federal research agency press releases. We compare model performance across embedding features, NLP features, and combined feature sets. The task yields promising performance across techniques and models, with NLP features consistently outperforming embeddings, while combined feature sets can underperform NLP alone. Stable cues concentrate in syntactic form and information distribution, aligning attack targets with structurally salient sentences that carry explicit commitments. Anticipating attack targets enables proactive strategies for official communication.
License
Copyright (c) 2026 Hsien-Te Kao, Peter Bautista, William Dupree, Gabriel Ganberg, Jeffrey M. Beaubien, Laura Cassani, Svitlana Volkova

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.