An Exploratory Study of Agentic Retrieval Augmented Generation for Mental Health Oriented Language Models

Authors

  • Khoa Pham, Mississippi State University
  • Jiacheng Li, The University of Alabama
  • Hassan S. Al Khatib, The University of Alabama
  • Shahram Rahimi, The University of Alabama
  • Noorbakhsh Amiri Golilarz, The University of Alabama
  • Andy Perkins, Mississippi State University

DOI:

https://doi.org/10.32473/flairs.39.1.141782

Keywords:

Agentic AI, Agentic Context Engineering (ACE), Agentic RAG, Mental Health, Mental Health LLMs, Retrieval Augmented Generation (RAG)

Abstract

Mental health conditions affect over one billion individuals globally and remain challenging to assess accurately due to fragmented clinical data and subjective evaluation methods. Mental health support systems increasingly rely on large language models (LLMs) because of their capabilities in natural language understanding and response generation. While retrieval-augmented generation (RAG) and agentic frameworks have improved grounded generation in several domains, little is known about how such approaches affect response quality in mental health tasks. In particular, the impact of structured context management and autonomous refinement on clinical relevance, empathy, completeness, and safety remains underexplored. In this study, we investigate the effects of agentic RAG on the performance of multiple mental-health-oriented language models. We adopt a common pipeline configuration that integrates patient dialogue, structured patient history, and externally retrieved clinical knowledge. The pipeline consists of coordinated stages for patient context retrieval, context augmentation, and response generation with autonomous evaluation and iterative refinement. We conduct empirical evaluations across four mental health models under this pipeline and analyze their performance in terms of medical accuracy, empathy, completeness, safety, and overall response quality. Our results show consistent trends toward improved responses when structured context handling and agentic refinement are applied, indicating that these components influence model behavior independently of architecture. This work provides insight into how agentic RAG influences model outputs in mental health applications and highlights the importance of context engineering and quality control in LLM-based support systems.
These findings indicate that Agentic Context Engineering (ACE) may contribute to improved reasoning depth, contextual alignment, and patient-centered response quality across diverse models. Despite the improvements observed, the framework remains an early step toward more reliable AI-assisted mental health assessment. Continued research is needed to refine model architectures, optimize prompt engineering, and expand evaluation across broader and more diverse clinical contexts to ensure safety, consistency, and real-world applicability.
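
The three pipeline stages named in the abstract (patient context retrieval, context augmentation, and response generation with autonomous evaluation and iterative refinement) could be sketched roughly as follows. This is an illustrative outline only, not the authors' implementation: the function names, the keyword-overlap retrieval, the prompt layout, the score threshold, and the round limit are all assumptions made for the example, and the `generate` and `evaluate` callables stand in for whatever model and judge a real system would use.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Evaluation dimensions named in the abstract; treating them as dict keys is an assumption.
CRITERIA = ("medical_accuracy", "empathy", "completeness", "safety")


@dataclass
class Context:
    dialogue: str
    history: List[str]
    knowledge: List[str]


def retrieve_patient_context(dialogue: str, history_store: List[str],
                             knowledge_store: List[str], k: int = 2) -> Context:
    """Stage 1 (hypothetical): rank entries by word overlap with the dialogue."""
    words = set(dialogue.lower().split())

    def top(entries: List[str]) -> List[str]:
        ranked = sorted(entries,
                        key=lambda e: len(words & set(e.lower().split())),
                        reverse=True)
        return ranked[:k]

    return Context(dialogue, top(history_store), top(knowledge_store))


def augment(ctx: Context) -> str:
    """Stage 2 (hypothetical): merge dialogue, history, and knowledge into one prompt."""
    return "\n".join([
        "Patient history: " + "; ".join(ctx.history),
        "Clinical knowledge: " + "; ".join(ctx.knowledge),
        "Patient says: " + ctx.dialogue,
    ])


def run_pipeline(dialogue: str, history_store: List[str], knowledge_store: List[str],
                 generate: Callable[[str], str],
                 evaluate: Callable[[str, Context], Dict[str, float]],
                 threshold: float = 0.8,
                 max_rounds: int = 3) -> Tuple[str, Dict[str, float], int]:
    """Stage 3 (hypothetical): generate, self-evaluate, and refine until every
    criterion reaches the threshold or the round limit is hit."""
    ctx = retrieve_patient_context(dialogue, history_store, knowledge_store)
    prompt = augment(ctx)
    response = generate(prompt)
    scores = evaluate(response, ctx)
    rounds = 1
    while min(scores.values()) < threshold and rounds < max_rounds:
        weak = [c for c in CRITERIA if scores[c] < threshold]
        revision_prompt = (prompt
                           + "\nRevise the previous answer to improve: " + ", ".join(weak)
                           + "\nPrevious answer: " + response)
        response = generate(revision_prompt)
        scores = evaluate(response, ctx)
        rounds += 1
    return response, scores, rounds
```

In this sketch the refinement loop stops as soon as the weakest criterion clears the threshold, which is one simple way to read "autonomous evaluation and iterative refinement"; the paper itself may use a different stopping rule or an aggregate score.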

Published

06-05-2026

How to Cite

Pham, K., Li, J., Al Khatib, H. S., Rahimi, S., Amiri Golilarz, N., & Perkins, A. (2026). An Exploratory Study of Agentic Retrieval Augmented Generation for Mental Health Oriented Language Models. The International FLAIRS Conference Proceedings, 39(1). https://doi.org/10.32473/flairs.39.1.141782

Section

Special Track: AI in Healthcare Informatics