17 Comments

What prompts did you give ChatGPT to mark with? Evaluating the prompt you used is fundamental to getting GPT to do anything for you, especially something as complex as marking. I'm a little surprised you saw worse performance with GPT-4, given that every piece of research I've seen indicates it performs better than 3.5 in pretty much every task.

author

AQA are researching whether it is possible to mark short answer STEM questions with AI, but last I heard without much success.

Great, insightful article. I wonder whether you've looked at, or have plans to look at, using GPT-4 to mark STEM subjects?

I just want to say thanks for this valuable research. Love you real-pupil, real-classroom guys.

Have I misunderstood the test? Wouldn’t a better test of ChatGPT v No More Marking be to make it do comparative judgement with all the scripts, and rank them accordingly, rather than use a mark scheme? Isn’t the point of No More Marking that mark schemes are ambiguous?
