Improving LLM Thematic Analysis through Metric-Driven Self-Correction
DOI:
https://doi.org/10.32473/flairs.39.1.141807
Keywords:
LLM thematic analysis, representativeness, coverage metrics, automated qualitative analysis, self-correction
Abstract
Large language models (LLMs) are increasingly used to perform thematic analysis of qualitative data, yet they systematically underrepresent minority viewpoints. We propose a self-correction framework in which representativeness metrics (coverage gap, subgroup disparity, and rank correlation) are computed after initial theme generation and fed back as structured critique. The main contribution of this paper is the framework itself, which makes the quality of correction measurable and auditable. In experiments on 90 product reviews across three categories, Gemini 2.5 Flash reduced average coverage gap from 80.4% to 18.6% over three iterations, but the framework's metrics revealed that this improvement came at a cost: over-correction degraded rank correlation and increased subgroup disparity. Replication with Gemini 3.1 Pro showed no such failures. Without systematic measurement, these trade-offs would have been invisible in both cases.
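The abstract names three representativeness metrics but does not define them, so the following Python sketch is only an illustration of how such metrics might be computed and turned into a structured critique. Every definition here is an assumption rather than the authors' formulation: coverage gap is taken as the share of items whose topic no generated theme matches, subgroup disparity as the difference between the best- and worst-covered subgroup, and rank correlation as Spearman's coefficient (no ties) between topic frequency in the data and the weight the model gave each theme. The field names `group` and `topic` and all function names are hypothetical.

```python
from collections import defaultdict


def coverage_gap(items, themes):
    """Assumed definition: share of items whose topic matches no generated theme."""
    uncovered = [it for it in items if it["topic"] not in themes]
    return len(uncovered) / len(items)


def subgroup_disparity(items, themes):
    """Assumed definition: highest minus lowest per-subgroup coverage rate."""
    by_group = defaultdict(list)
    for it in items:
        by_group[it["group"]].append(it["topic"] in themes)
    rates = [sum(hits) / len(hits) for hits in by_group.values()]
    return max(rates) - min(rates)


def spearman(xs, ys):
    """Spearman rank correlation for equal-length sequences without ties."""
    def ranks(vals):
        order = sorted(range(len(vals)), key=lambda i: vals[i])
        r = [0] * len(vals)
        for rank, i in enumerate(order, start=1):
            r[i] = rank
        return r

    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))


def build_critique(items, themes, data_freq, model_weight):
    """Format the three metrics as a text critique for the next iteration."""
    return (
        f"coverage_gap={coverage_gap(items, themes):.2f}; "
        f"subgroup_disparity={subgroup_disparity(items, themes):.2f}; "
        f"rank_correlation={spearman(data_freq, model_weight):.2f}"
    )


# Toy data (hypothetical): three majority reviews, one minority review.
items = [
    {"group": "majority", "topic": "battery"},
    {"group": "majority", "topic": "battery"},
    {"group": "majority", "topic": "price"},
    {"group": "minority", "topic": "accessibility"},
]
themes = {"battery", "price"}  # themes from a hypothetical first pass
critique = build_critique(items, themes, [3, 2, 1], [3, 1, 2])
```

In a loop of this kind, the critique string would be appended to the next prompt and the metrics recomputed after each revision, which is what would make a trade-off such as falling coverage gap alongside rising disparity visible.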
Published
06-05-2026
How to Cite
Im, T., & Park, H. (2026). Improving LLM Thematic Analysis through Metric-Driven Self-Correction. The International FLAIRS Conference Proceedings, 39(1). https://doi.org/10.32473/flairs.39.1.141807
Section
Posters
License
Copyright (c) 2026 Tacksoo Im, Hyesung Park

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.