Classifying Target Sentences for LLM-Generated Persuasion Attacks in Press Releases from Federal Research Agencies

Authors

  • Hsien-Te Kao, Aptima, Inc.
  • Peter Bautista, Aptima, Inc.
  • William Dupree, Aptima, Inc.
  • Gabriel Ganberg, Aptima, Inc.
  • Jeffrey M. Beaubien, Aptima, Inc.
  • Laura Cassani, Aptima, Inc.
  • Svitlana Volkova, Aptima, Inc.

DOI:

https://doi.org/10.32473/flairs.39.1.141855

Keywords:

Target Classification, Attack Target, Persuasion Attack, Press Release, Research Agency

Abstract

Information campaigns increasingly use LLMs to generate persuasive competing narratives around federal research agency press releases. Prior work largely centers on post hoc assessment, emphasizing detectability, characterization, and susceptibility after persuasion attacks are observed. In this paper, we build sentence-level classifiers that label whether a sentence in a source press release is an attack target under 23 persuasion techniques and three generating LLMs, using 972 U.S. federal research agency press releases. We compare model performance across embedding features, NLP features, and combined feature sets. The task yields promising performance across techniques and models, with NLP features consistently outperforming embeddings, while combined feature sets can underperform NLP alone. Stable cues concentrate in syntactic form and information distribution, aligning attack targets with structurally salient sentences that carry explicit commitments. Anticipating attack targets enables proactive strategies for official communication.

Published

06-05-2026

How to Cite

Kao, H.-T., Bautista, P., Dupree, W., Ganberg, G., Beaubien, J. M., Cassani, L., & Volkova, S. (2026). Classifying Target Sentences for LLM-Generated Persuasion Attacks in Press Releases from Federal Research Agencies. The International FLAIRS Conference Proceedings, 39(1). https://doi.org/10.32473/flairs.39.1.141855