Do Programmers and AI See the Same Problem?

Quantifying Cognitive Misalignment in Code Generation

Authors

  • Yi Zhang, Purdue University
  • Julia Rayz

DOI:

https://doi.org/10.32473/flairs.39.1.141770

Keywords:

Large Language Models, Code Generation, Bloom's Taxonomy

Abstract

The integration of AI assistants into software development raises fundamental questions about how these assistants evaluate task complexity and how closely their evaluations align with human perception. Current evaluations focus primarily on functional correctness and overlook this cognitive alignment. We introduce and empirically examine cognitive misalignment: the discrepancy between human and AI perceptions of a task's cognitive demands. Using Bloom’s Taxonomy, we prompt five LLMs to classify 2,520 tasks from three code generation benchmarks, and establish human reference annotations for 150 tasks via expert consensus. Results show systematic misalignment: humans predominantly classify tasks as "Apply" or "Analyze", whereas several LLMs overestimate the "Create" dimension. This gap varies by model and task type and may contribute to observed interaction frictions and productivity paradoxes. Our findings motivate the development of cognitively aware benchmarks and evaluation methods that better reflect human judgments of task complexity.
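
The comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not state which metrics the paper uses, so the function names, the three summary statistics (exact agreement, mean signed level shift, per-level share gap), and the sample labels below are all assumptions made for illustration. The six level names follow the revised Bloom's Taxonomy.

```python
from collections import Counter

# The six levels of the revised Bloom's Taxonomy, lowest to highest.
BLOOM_LEVELS = ["Remember", "Understand", "Apply", "Analyze", "Evaluate", "Create"]


def level_distribution(labels):
    """Return the share of labels assigned to each Bloom level."""
    counts = Counter(labels)
    total = len(labels)
    return {level: counts[level] / total for level in BLOOM_LEVELS}


def misalignment_summary(human, model):
    """Compare two equal-length label lists covering the same tasks.

    Hypothetical helper: these statistics are illustrative assumptions,
    not the metrics reported in the paper.
    """
    if len(human) != len(model) or not human:
        raise ValueError("label lists must be non-empty and equal in length")
    # Fraction of tasks where both annotators chose the same level.
    agreement = sum(h == m for h, m in zip(human, model)) / len(human)
    # Signed rank difference: positive means the model rated tasks higher.
    rank = {level: i for i, level in enumerate(BLOOM_LEVELS)}
    mean_shift = sum(rank[m] - rank[h] for h, m in zip(human, model)) / len(human)
    # Per-level gap in label share (model share minus human share).
    human_dist = level_distribution(human)
    model_dist = level_distribution(model)
    gap = {level: model_dist[level] - human_dist[level] for level in BLOOM_LEVELS}
    return {"agreement": agreement, "mean_shift": mean_shift, "gap": gap}


# Made-up labels for four hypothetical tasks (not data from the paper).
human_labels = ["Apply", "Analyze", "Apply", "Analyze"]
model_labels = ["Apply", "Create", "Create", "Analyze"]
summary = misalignment_summary(human_labels, model_labels)
```

Under these assumptions, a positive `mean_shift` together with a positive gap for "Create" would correspond to the kind of overestimation the abstract reports for several LLMs.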

Published

06-05-2026

How to Cite

Zhang, Y., & Rayz, J. (2026). Do Programmers and AI See the Same Problem? Quantifying Cognitive Misalignment in Code Generation. The International FLAIRS Conference Proceedings, 39(1). https://doi.org/10.32473/flairs.39.1.141770

Section

Special Track: Human-AI Collaboration and Augmented Intelligence