Improving Resilience Against Cyber-attacks via Reward-Shaped Reinforcement Learning in a Network Defense Game

Authors

  • Prithviraj Dasgupta, U.S. Naval Research Laboratory

DOI:

https://doi.org/10.32473/flairs.39.1.141839

Keywords:

cyber-defense, deep reinforcement learning, reward shaping

Abstract

Artificial intelligence tools are increasingly used by cyber-attackers to craft sophisticated attacks that can expose vulnerabilities and establish backdoors on enterprise networks. To respond to such smart attackers, cyber-defense mechanisms need to be dynamic and agile, precisely predicting attack locations in the network and rapidly removing any attacker artifacts. Reinforcement learning (RL) techniques have been demonstrated as a successful means of devising effective cyber-defense techniques via penetration testing. However, a limitation of such RL techniques is the increasing latency in learning a defender policy against dynamically changing attack strategies. In this paper, we explore reward shaping techniques within RL as a means to reduce the learning times for defender policies. We show that periodically injecting real-time network information, such as node importance and network compromise state, into the RL algorithm via a shaped reward function can reduce the defender’s learning time. We report experimental results on different topologies and configurations of a simulated enterprise network and show that our proposed approach can significantly improve the learning times and effectiveness of the defender policies.
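The abstract does not give the form of the shaped reward function. As a rough illustration only, the sketch below uses standard potential-based reward shaping, which may differ from the paper's method. It assumes a hypothetical potential equal to the negative total importance of the currently compromised nodes. The node names, importance values, discount factor, and function names are all invented for this example.

```python
from typing import Dict, Set

GAMMA = 0.99  # discount factor (assumed value, not from the paper)


def potential(importance: Dict[str, float], compromised: Set[str]) -> float:
    """Hypothetical potential: negative summed importance of compromised nodes."""
    return -sum(importance[node] for node in compromised)


def shaped_reward(
    base_reward: float,
    importance: Dict[str, float],
    compromised_before: Set[str],
    compromised_after: Set[str],
    gamma: float = GAMMA,
) -> float:
    """Potential-based shaping: r' = r + gamma * Phi(s') - Phi(s)."""
    phi_before = potential(importance, compromised_before)
    phi_after = potential(importance, compromised_after)
    return base_reward + gamma * phi_after - phi_before


# Illustrative node importance scores (made up for this sketch).
importance = {"web": 1.0, "db": 3.0, "dc": 5.0}

# The defender restores the database server: the shaping term is positive.
r_restore = shaped_reward(0.0, importance, {"web", "db"}, {"web"})

# The attacker compromises the domain controller: the shaping term is negative.
r_breach = shaped_reward(0.0, importance, {"web"}, {"web", "dc"})
```

Under these assumptions, restoring an important node yields a larger bonus than restoring a peripheral one, which gives the defender denser feedback than a sparse end-of-episode reward.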

Published

06-05-2026

How to Cite

Dasgupta, P. (2026). Improving Resilience Against Cyber-attacks via Reward-Shaped Reinforcement Learning in a Network Defense Game. The International FLAIRS Conference Proceedings, 39(1). https://doi.org/10.32473/flairs.39.1.141839

Section

Special Track: Security, Privacy and Ethics in AI