Performance of freely available vision-capable chatbots on the test for understanding graphs in kinematics

written by Giulia Polverini and Bor Gregorcic

In this paper, we evaluate the performance of three freely available vision-capable chatbots – Copilot, Gemini, and Claude 3 Sonnet – on the Test of Understanding Graphs in Kinematics (TUG-K). Our analysis highlights a performance gap between these freely available chatbots and the state-of-the-art, subscription-based ChatGPT-4. We also report largely unclear patterns of performance of the tested chatbots on different types of tasks. We discuss the implications of our findings for using chatbots in educational contexts, point out potential challenges for educational equity, and provide some ideas for future research that could help us better understand the patterns in the chatbots' performance on tasks that involve the interpretation of graphical input.

Last Modified September 6, 2024
