Improving LLM Thematic Analysis through Metric-Driven Self-Correction

Authors

  • Tacksoo Im, Georgia Gwinnett College
  • Hyesung Park

DOI:

https://doi.org/10.32473/flairs.39.1.141807

Keywords:

LLM thematic analysis, representativeness, coverage metrics, automated qualitative analysis, self-correction

Abstract

Large language models (LLMs) are increasingly used to perform thematic analysis of qualitative data, yet they systematically underrepresent minority viewpoints. We propose a self-correction framework in which representativeness metrics (coverage gap, subgroup disparity, and rank correlation) are computed after initial theme generation and fed back to the model as structured critique. The main contribution of this paper is the framework itself, which makes the quality of correction measurable and auditable. In experiments on 90 product reviews across three categories, Gemini 2.5 Flash reduced the average coverage gap from 80.4% to 18.6% over three iterations, but the framework's metrics revealed that this improvement came at a cost: over-correction degraded rank correlation and increased subgroup disparity. Replication with Gemini 3.1 Pro showed no such failures. Without systematic measurement, these trade-offs would have been invisible in both cases.

Published

06-05-2026

How to Cite

Im, T., & Park, H. (2026). Improving LLM Thematic Analysis through Metric-Driven Self-Correction. The International FLAIRS Conference Proceedings, 39(1). https://doi.org/10.32473/flairs.39.1.141807