The Robot Maze Test

An Evaluation of Situated Learning for Humans and Machine Agents

Authors

  • Ian Perera IHMC
  • Brady DeCouto Florida State University
  • Christopher J. Bates IHMC
  • Matthew Johnson IHMC
  • Charles B. Patterson IHMC
  • Alec Treacy Florida State University
  • Zoeanne McCurdy IHMC

DOI:

https://doi.org/10.32473/flairs.39.1.141850

Keywords:

LLM, Human-AI Agreement, AI Education, Human-AI Collaboration, Human-Machine Collaboration, Automated Evaluation, Model Trust, Cognitive Ability Tests

Abstract

With the growing popularity of Large Language Models (LLMs) and their introduction into workplaces across multiple fields, an important question remains unexplored: what cognitive skills and attributes make an individual well-suited to interact with such black-box systems? To answer this, we developed a simulated robot planning task that tests an individual’s ability to infer, through interaction and experimentation, how a novel environment influences a robot’s behavior. Our platform revealed that users with greater system knowledge at the end of the task typically used slower, exploratory interactions and tested hypotheses. We then extended this platform to include a code-generation LLM that serves as a collaborative learning agent, updating a model of robot interactions through a combination of exploration and natural language guidance. We believe this framework and the collected data provide an opportunity to study human-LLM situated model building, error correction performance, and alignment of learning behaviors in new environments.

Published

06-05-2026

How to Cite

Perera, I., DeCouto, B., Bates, C., Johnson, M., Patterson, C., Treacy, A., & McCurdy, Z. (2026). The Robot Maze Test: An Evaluation of Situated Learning for Humans and Machine Agents. The International FLAIRS Conference Proceedings, 39(1). https://doi.org/10.32473/flairs.39.1.141850

Section

Special Track: Human-AI Collaboration and Augmented Intelligence