Context is Key: Aligning Large Language Models with Human Moral Judgments through Retrieval-Augmented Generation

Authors

  • Matthew Boraske West Chester University of Pennsylvania
  • Richard Burns West Chester University of Pennsylvania

DOI:

https://doi.org/10.32473/flairs.38.1.138947

Keywords:

Large Language Models, Applied Natural Language Processing, Social Media Analysis, Retrieval-Augmented Generation, Case-Based Reasoning, GPT

Abstract

In this paper, we investigate whether pre-trained large language models (LLMs) can align with human moral judgments on a dataset of approximately fifty thousand interpersonal conflicts from the AITA (Am I the A******) subreddit, an online forum where users evaluate the morality of others. We introduce a retrieval-augmented generation (RAG) approach that uses pre-trained LLMs as core components. After collecting conflict posts from AITA and embedding them in a vector database, the RAG agent retrieves the most relevant posts for each new query. These posts are then used sequentially as context to gradually refine the LLM's judgment, providing adaptability without costly fine-tuning. Using OpenAI's GPT-4o, our agent outperforms direct prompting of the LLM, achieving 83% accuracy and a Matthews correlation coefficient of 0.469 while reducing the rate of toxic responses from 22.53% to virtually zero. These findings indicate that integrating LLMs into RAG agents is an effective way to improve their alignment with human moral judgments while mitigating toxic language.
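The retrieve-then-refine loop the abstract describes can be pictured with a short sketch. The snippet below is a minimal illustration, not the authors' code: the embedding model, the retrieval depth k=3, the prompt wording, and helper names such as `judge_conflict` are all assumptions, and an in-memory cosine-similarity search stands in for the paper's vector database.

```python
# Minimal sketch of the retrieve-then-refine RAG loop described in the abstract.
# Model choices, k=3, prompts, and function names are illustrative assumptions.
import numpy as np
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def embed(texts: list[str]) -> np.ndarray:
    """Embed texts with an OpenAI embedding model (model choice assumed)."""
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])


def retrieve(query: str, corpus: list[str], corpus_emb: np.ndarray, k: int = 3) -> list[str]:
    """Return the k corpus posts most similar to the query by cosine similarity."""
    q = embed([query])[0]
    sims = corpus_emb @ q / (np.linalg.norm(corpus_emb, axis=1) * np.linalg.norm(q))
    top = np.argsort(sims)[::-1][:k]
    return [corpus[i] for i in top]


def judge_conflict(query: str, corpus: list[str], corpus_emb: np.ndarray) -> str:
    """Sequentially refine a moral judgment, one retrieved precedent at a time."""
    judgment = "No judgment yet."
    for post in retrieve(query, corpus, corpus_emb):
        resp = client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system",
                 "content": "You judge AITA conflicts, giving a verdict and a brief, non-toxic rationale."},
                {"role": "user",
                 "content": f"Similar past conflict:\n{post}\n\n"
                            f"Current judgment:\n{judgment}\n\n"
                            f"New conflict:\n{query}\n\nRefine the judgment."},
            ],
        )
        judgment = resp.choices[0].message.content
    return judgment
```

Feeding one retrieved precedent per call, rather than packing all of them into a single prompt, matches the gradual refinement the abstract describes and keeps each prompt short.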

Published

14-05-2025

How to Cite

Boraske, M., & Burns, R. (2025). Context is Key: Aligning Large Language Models with Human Moral Judgments through Retrieval-Augmented Generation. The International FLAIRS Conference Proceedings, 38(1). https://doi.org/10.32473/flairs.38.1.138947

Section

Special Track: Applied Natural Language Processing