Addressing a Bias in Evaluating of Student Self-Explanations of Worked Programming Examples
DOI:
https://doi.org/10.32473/flairs.39.1.141405
Keywords:
Large Language Models, Biases, Programming Explanations, Code Comprehension, Automated Evaluation
Abstract
Worked examples are step-by-step solutions to problems in a specific domain, offered to students to help them acquire domain-specific problem-solving skills. The power of worked examples could be magnified by combining them with self-explanations, which ask students to explain each problem-solving step rather than passively study it. The main challenge of this approach is assessing the correctness of the students' explanations. In the current approach, student explanations are judged by their semantic similarity to an explanation provided by an instructor or domain expert. However, recent studies of example explanations in the domain of programming demonstrated that many students express themselves very differently from domain experts. In this situation, a traditional semantic similarity approach might introduce bias against students who explain worked examples correctly but whose explanations differ considerably from expert explanations. In this paper, we use a recently published dataset to compare several explanation-assessment approaches based on semantic similarity with alternative approaches based on direct Large Language Model (LLM) prompting. Our results show that the use of LLMs enables worked example systems that follow an active learning approach to reduce bias in evaluating example explanations.
License
Copyright (c) 2026 Arun Balajiee Lekshmi Narayanan, Dr. Xiang Lorraine Li, Dr. Peter Brusilovsky

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.