Knowledge-Augmented Large Language Models for Automated Characterization of Cybersecurity Vulnerabilities

Authors

  • Elijah Needham Florida Polytechnic University
  • Denis Ulybyshev Florida Polytechnic University
  • Ayesha Dina Florida Polytechnic University

DOI:

https://doi.org/10.32473/flairs.39.1.141803

Abstract

The US National Vulnerability Database (NVD) is a public repository of software and hardware vulnerabilities maintained by the National Institute of Standards and Technology, which also introduced the Vulnerability Description Ontology (VDO) to standardize vulnerability characterization. Despite advances in secure software development and vulnerability detection techniques, the number of software vulnerabilities registered in the NVD continues to increase, making accurate characterization essential for selecting effective defense strategies and reducing cyber risk. However, manual labeling is costly and time-consuming, and traditional machine learning approaches often require large labeled datasets. This paper proposes an LLM-driven framework, guided by VDO, for characterizing Common Vulnerabilities and Exposures (CVEs). The framework includes two agents: (1) a Context Enrichment Agent that augments sparse CVE descriptions with relevant technical information from external sources, and (2) an Ontology Guided Characterization Agent that performs structured multi-label classification using VDO definitions and N-shot prompting. This design addresses three challenges: the limited detail in the official CVE text, the complexity and imbalance of VDO labels, and generalization to newly disclosed vulnerabilities. We evaluated the framework on a VDO-labeled benchmark dataset and on a newly created dataset of 125 CVEs disclosed in 2024 and 2025 and labeled by our research project team. Experiments with GPT-4o, Gemini 2.5 Flash, and Llama 3.1 405B show consistent gains from context enrichment and N-shot prompting. GPT-4o achieves macro F1 scores up to 0.81, 0.91, 0.90, 0.87, and 0.83 on the benchmark for Context, Impact Method, Attack Theater, Logical Impact, and Mitigation, respectively, and reaches up to 0.95 macro F1 for Impact Method on the 2024 to 2025 dataset. Our source code is available at https://github.com/ayeshasdina/KA-LLM-CVE.
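
The abstract reports macro F1 per VDO category for a multi-label task. As a reading aid, the sketch below shows one common way to compute macro F1 over label sets: F1 is calculated separately for each label and then averaged with equal weight, so rare labels count as much as frequent ones. This is an illustration only and is not taken from the authors' repository; the function name macro_f1, the example label values, and the choice to score a label with no gold or predicted instances as 0.0 are all assumptions.

```python
from typing import List, Set


def macro_f1(gold: List[Set[str]], pred: List[Set[str]], labels: List[str]) -> float:
    """Unweighted mean of per-label F1 over a multi-label dataset.

    gold and pred hold one set of labels per CVE, in the same order.
    """
    per_label = []
    for label in labels:
        pairs = list(zip(gold, pred))
        tp = sum(1 for g, p in pairs if label in g and label in p)
        fp = sum(1 for g, p in pairs if label not in g and label in p)
        fn = sum(1 for g, p in pairs if label in g and label not in p)
        denom = 2 * tp + fp + fn
        # Assumption: a label that never appears in gold or pred scores 0.0.
        per_label.append(2 * tp / denom if denom else 0.0)
    return sum(per_label) / len(per_label)


# Hypothetical labels for a single category, three CVEs.
labels = ["Remote", "Local"]
gold = [{"Remote"}, {"Local"}, {"Remote", "Local"}]
pred = [{"Remote"}, {"Remote"}, {"Local"}]

score = macro_f1(gold, pred, labels)
print(round(score, 4))
```

In this toy example the per-label F1 values are 0.5 for "Remote" and about 0.667 for "Local", so the macro average is about 0.583. Because every label has equal weight, a poorly predicted rare label lowers the score noticeably, which is why macro F1 is a common choice when labels are imbalanced.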

Published

06-05-2026

How to Cite

Needham, E., Ulybyshev, D., & Dina, A. (2026). Knowledge-Augmented Large Language Models for Automated Characterization of Cybersecurity Vulnerabilities. The International FLAIRS Conference Proceedings, 39(1). https://doi.org/10.32473/flairs.39.1.141803