MineObserver: A Deep Learning Framework for Assessing Natural Language Descriptions of Minecraft Imagery
DOI:
https://doi.org/10.32473/flairs.v35i.130729
Keywords:
Computer Vision, Natural Language Processing, Pedagogical Agent
Abstract
This paper introduces a novel approach for learning natural language descriptions of scenery in Minecraft. We apply techniques from Computer Vision and Natural Language Processing to create an AI framework called MineObserver for assessing the accuracy of learner-generated descriptions of science-related images. The ultimate purpose of the system is to automatically assess the accuracy of learner observations, written in natural language, made during science learning activities that take place in Minecraft. Eventually, MineObserver will be used as part of a pedagogical agent framework for providing in-game support for learning. Preliminary results are mixed but promising, with approximately 62% of the images in our test set properly classified by our image captioning approach. Broadly, our work suggests that computer vision techniques work as expected in Minecraft and can serve as a basis for assessing learner observations.
License
Copyright (c) 2022 Jay Mahajan, Samuel Hum, Jeff Ginger, H. Chad Lane
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.