
Achieving Human Level Partial Credit Grading of Written Responses to Physics Conceptual Question using GPT-3.5 with Only Prompt Engineering
written by Zhongzhou Chen and Tong Wan
Large language models (LLMs) have great potential for auto-grading student written responses to physics problems due to their capacity to process and generate natural language. In this explorative study, we use a prompt engineering technique, which we name "scaffolded chain of thought (COT)", to instruct GPT-3.5 to grade student written responses to a physics conceptual question. Compared to common COT prompting, scaffolded COT prompts GPT-3.5 to explicitly compare student responses to a detailed, well-explained rubric before generating the grading outcome. We show that when compared to human raters, the grading accuracy of GPT-3.5 using scaffolded COT is 20% - 30% higher than with conventional COT. The level of agreement between AI and human raters can reach 70% - 80%, comparable to the level between two human raters. This shows promise that an LLM-based AI grader can achieve human-level grading accuracy on a physics conceptual problem using prompt engineering techniques alone.
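The abstract only sketches the scaffolded COT idea, so the Python snippet below illustrates one way such a prompt could be assembled and sent to GPT-3.5 through the OpenAI chat completions API. The question, rubric, score format, and model name are placeholders chosen for demonstration; they are assumptions, not the authors' actual materials or prompt.

# Minimal sketch of a "scaffolded chain of thought" grading prompt, assuming the
# OpenAI chat completions API. The question, rubric, and student response below
# are illustrative placeholders, not the materials used in the paper.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

QUESTION = "A ball is thrown straight up. What is its acceleration at the highest point?"

# A detailed, well-explained rubric (hypothetical; the paper's rubric is not reproduced here).
RUBRIC = """\
Item 1 (1 point): States that the acceleration is not zero at the highest point.
  Explanation: The velocity is momentarily zero, but gravity still acts on the ball.
Item 2 (1 point): Identifies the acceleration as g (about 9.8 m/s^2), directed downward.
  Explanation: Full credit requires both the magnitude and the downward direction.
"""

def grade_response(student_response: str) -> str:
    """Ask the model to compare the response to each rubric item before scoring."""
    prompt = f"""You are grading a student's written answer to a physics conceptual question.

Question:
{QUESTION}

Rubric:
{RUBRIC}

Student response:
{student_response}

For each rubric item, first quote the relevant part of the student response,
then explain whether it satisfies that item. Only after comparing the response
to every rubric item, output the total score on the last line as "Score: X/2".
"""
    completion = client.chat.completions.create(
        model="gpt-3.5-turbo",
        temperature=0,  # reduce run-to-run variation in grading
        messages=[{"role": "user", "content": prompt}],
    )
    return completion.choices[0].message.content

if __name__ == "__main__":
    print(grade_response("The acceleration is zero because the ball stops for an instant."))

The key difference from conventional COT prompting, as described in the abstract, is that the model is explicitly scaffolded to walk through the rubric item by item before it is allowed to emit a grade, rather than simply being told to "think step by step."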
Physics Education Research Conference 2024
Part of the PER Conference series
Boston, MA: July 10-11, 2024
Pages 97-101
Subjects:
- Education Foundations > Assessment > Conceptual Assessment
- Education Foundations > Assessment > Methods
- Education Practices > Technology > Computers
Levels:
- Lower Undergraduate
Resource Types:
- Reference Material > Research study
Intended Users:
- Researchers
Formats:
- application/pdf


Mirror:
https://doi.org/10.1119/perc.2024…
Access Rights:
Free access
License:
This material is released under a Creative Commons Attribution 4.0 license. Further distribution of this work must maintain attribution to the published article's author(s), title, proceedings citation, and DOI.
Rights Holder:
American Association of Physics Teachers
DOI:
10.1119/perc.2024.pr.Chen
Keyword:
PERC 2024
Record Creator:
Metadata instance created September 6, 2024 by Lyle Barbato
Record Updated:
September 12, 2024 by Lyle Barbato
Last Update when Cataloged:
September 12, 2024

AIP Format
Z. Chen and T. Wan, presented at the Physics Education Research Conference 2024, Boston, MA, 2024, WWW Document, (https://www.compadre.org/Repository/document/ServeFile.cfm?ID=16878&DocID=5945).
AJP/PRST-PER
Z. Chen and T. Wan, Achieving Human Level Partial Credit Grading of Written Responses to Physics Conceptual Question using GPT-3.5 with Only Prompt Engineering, presented at the Physics Education Research Conference 2024, Boston, MA, 2024, <https://www.compadre.org/Repository/document/ServeFile.cfm?ID=16878&DocID=5945>.
APA Format
Chen, Z., & Wan, T. (2024, July 10-11). Achieving Human Level Partial Credit Grading of Written Responses to Physics Conceptual Question using GPT-3.5 with Only Prompt Engineering. Paper presented at Physics Education Research Conference 2024, Boston, MA. Retrieved May 1, 2025, from https://www.compadre.org/Repository/document/ServeFile.cfm?ID=16878&DocID=5945
Chicago Format
Chen, Zhongzhou, and Tong Wan. "Achieving Human Level Partial Credit Grading of Written Responses to Physics Conceptual Question using GPT-3.5 with Only Prompt Engineering." Paper presented at the Physics Education Research Conference 2024, Boston, MA, July 10-11, 2024. https://www.compadre.org/Repository/document/ServeFile.cfm?ID=16878&DocID=5945 (accessed 1 May 2025).
MLA Format
Chen, Zhongzhou, and Tong Wan. "Achieving Human Level Partial Credit Grading of Written Responses to Physics Conceptual Question using GPT-3.5 with Only Prompt Engineering." Physics Education Research Conference 2024. Boston, MA: 2024. 97-101 of PER Conference. 1 May 2025 <https://www.compadre.org/Repository/document/ServeFile.cfm?ID=16878&DocID=5945>.
BibTeX Export Format
@inproceedings{Chen2024,
  Author    = "Zhongzhou Chen and Tong Wan",
  Title     = {Achieving Human Level Partial Credit Grading of Written Responses to Physics Conceptual Question using GPT-3.5 with Only Prompt Engineering},
  BookTitle = {Physics Education Research Conference 2024},
  Pages     = {97-101},
  Address   = {Boston, MA},
  Series    = {PER Conference},
  Month     = {July 10-11},
  Year      = {2024}
}
Refer Export Format

%A Zhongzhou Chen
%A Tong Wan
%T Achieving Human Level Partial Credit Grading of Written Responses to Physics Conceptual Question using GPT-3.5 with Only Prompt Engineering
%S PER Conference
%D July 10-11 2024
%P 97-101
%C Boston, MA
%U https://www.compadre.org/Repository/document/ServeFile.cfm?ID=16878&DocID=5945
%O Physics Education Research Conference 2024
%O July 10-11
%O application/pdf

EndNote Export Format

%0 Conference Proceedings
%A Chen, Zhongzhou
%A Wan, Tong
%D July 10-11 2024
%T Achieving Human Level Partial Credit Grading of Written Responses to Physics Conceptual Question using GPT-3.5 with Only Prompt Engineering
%B Physics Education Research Conference 2024
%C Boston, MA
%P 97-101
%S PER Conference
%8 July 10-11
%U https://www.compadre.org/Repository/document/ServeFile.cfm?ID=16878&DocID=5945


Disclaimer: ComPADRE offers citation styles as a guide only. We cannot offer interpretations about citations as this is an automated procedure. Please refer to the style manuals in the Citation Source Information area for clarifications.

Citation Source Information

The AIP Style presented is based on information from the AIP Style Manual.

The APA Style presented is based on information from APA Style.org: Electronic References.

The Chicago Style presented is based on information from Examples of Chicago-Style Documentation.

The MLA Style presented is based on information from the MLA FAQ.
