Knowledge-Augmented Large Language Models for Automated Characterization of Cybersecurity Vulnerabilities
DOI:
https://doi.org/10.32473/flairs.39.1.141803
Abstract
The US National Vulnerability Database (NVD) is a public repository of software and hardware vulnerabilities maintained by the National Institute of Standards and Technology, which also introduced the Vulnerability Description Ontology (VDO) to standardize vulnerability characterization. Despite advances in secure software development and techniques for detecting vulnerabilities, the number of software vulnerabilities registered in the NVD continues to increase, making accurate characterization essential for selecting effective defense strategies and reducing cyber risk. However, manual labeling is costly and time-consuming, and traditional machine learning approaches often require large labeled datasets. This paper proposes an LLM-driven framework, guided by VDO, for characterizing Common Vulnerabilities and Exposures (CVEs). The framework includes two agents: (1) a Context Enrichment Agent that augments sparse CVE descriptions with relevant technical information from external sources, and (2) an Ontology Guided Characterization Agent that performs structured multi-label classification using VDO definitions and N-shot prompting. This design addresses three challenges: the limited detail in the official CVE text, the complexity and imbalance of VDO labels, and generalization to newly disclosed vulnerabilities. We evaluated the framework on a VDO-labeled benchmark dataset and on a newly created dataset of 125 CVEs disclosed from 2024 to 2025 and labeled by our research project team. Experiments with GPT 4o, Gemini 2.5 Flash, and Llama 3.1 405B show consistent gains from context enrichment and N-shot prompting. GPT 4o achieves macro F1 scores up to 0.81, 0.91, 0.90, 0.87, and 0.83 on the benchmark for Context, Impact Method, Attack Theater, Logical Impact, and Mitigation, respectively, and reaches up to 0.95 macro F1 for Impact Method on the 2024 to 2025 dataset. Our source code is available at https://github.com/ayeshasdina/KA-LLM-CVE.
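The macro F1 scores quoted above average per-label F1 without weighting by label frequency, which matters when labels are imbalanced. The sketch below shows one common way to compute it for multi-label predictions. It is not taken from the authors' repository: the function name `macro_f1`, the placeholder labels, and the convention that a label with no gold and no predicted instances scores 0 are all assumptions made for illustration.

```python
from typing import List, Set


def macro_f1(gold: List[Set[str]], pred: List[Set[str]], labels: List[str]) -> float:
    """Unweighted mean of per-label F1 for multi-label predictions.

    gold[i] and pred[i] are the sets of labels assigned to item i.
    Each label is scored one-vs-rest, then the scores are averaged.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if not labels:
        raise ValueError("labels must not be empty")

    scores = []
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if label in g and label in p)
        fp = sum(1 for g, p in zip(gold, pred) if label not in g and label in p)
        fn = sum(1 for g, p in zip(gold, pred) if label in g and label not in p)
        denom = 2 * tp + fp + fn
        # Assumed convention: a label that never appears in gold or pred scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Placeholder labels, not actual VDO values.
    labels = ["A", "B", "C"]
    gold = [{"A"}, {"A", "B"}, {"C"}, {"B"}]
    pred = [{"A"}, {"A"}, {"B", "C"}, {"B"}]
    print(round(macro_f1(gold, pred, labels), 4))
```

Because every label contributes equally to the average, a rare label that is classified poorly lowers the score as much as a frequent one would, which is why macro averaging is a natural fit for imbalanced label sets.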
License
Copyright (c) 2026 Elijah Needham, Denis Ulybyshev, Ayesha Dina

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.