Can ChatGPT provide useful feedback?

Jan 28, 2023

In our last post, we looked at the way ChatGPT’s written feedback is superficially impressive but not that helpful pedagogically. In this post, we’ll look at ways it can provide something more useful.

Part of the challenge here is that providing good feedback on writing is hard for humans to do, let alone AI. Ideally, you want something that a student has to act on, not something that they just passively read. So you want to produce something that is fairly specific.

But you don’t want to produce something that is too specific. The aim, as Dylan Wiliam says, is to improve the student, not the work. This is why I sometimes get uneasy about whole-class feedback that essentially turns into editing or redrafting the work. You don’t really want to improve the essay; you want to improve the students’ thinking so that their next essay doesn’t have the same mistakes.

So how does ChatGPT fare with this challenge? The rest of this blog post will discuss a student response to the following GCSE question: ‘It is the people who have extraordinary skill, courage and determination who deserve to be famous, not those who have good looks or lots of money or behave badly.’ Write a letter to the editor of a newspaper in which you argue your point of view in response to this statement.

We saw in the previous post that if you ask ChatGPT to grade an essay and explain why, it provides a fairly generic paragraph based on the mark scheme.

Technically impressive, but not actually that useful.

What about if you ask it how the writer could improve the essay?

This is definitely better, but it falls into the trap I talked about above, of being too specific and of seeking to improve the work, not the student. If this student went off and did more research on Charli D’Amelio and Kim Kardashian, it is unlikely that would help them in their next essay.

So what happens when I change the prompt and ask it to focus on the student, not the essay? I think it comes up with something pretty good!

It thinks run-on sentences and comma splices are important! It brought a tear to my eye! Who isn’t going to love an AI that agrees with them?

But I am still not completely happy. Yes, I agree that the writer should practise writing short, concise and well-structured sentences, but that’s still too vague for me! I want actual examples of what the student should do. So I tried a prompt that asked for an actual activity.

This is so close to being great, but ultimately it is wrong. None of the options are correct, and the option it says is correct has actually just replaced the run-on with a comma splice. The correct option should be: Take Kim Kardashian as an example. She didn’t deserve her fame at first, but she grew and learnt.

Still, it is capable in principle of providing specific and actionable feedback that will help a student to improve their thinking. And it would be very easy to edit that question so that it was correct — far easier than writing it out from scratch.

Generally speaking, I find that it is a bit hit and miss when constructing multiple-choice questions. Sometimes it provides questions where all the options are wrong. Sometimes it says there is one right answer when there is actually more than one. Sometimes it just comes up with gobbledegook. In practice I have found it works best with the work of a) younger students or b) older students who have made relatively few errors. It is weakest at providing feedback for older students who have made a lot of errors, as these errors tend to be more complex and harder to unpick.

However, I am inclined to cut it a little slack because I am asking it to do hard tasks that would be incredibly time-consuming for a teacher, and it is still producing some quite useful outputs. You can also follow up by saying ‘please give me four more questions like this one’. Not even the most Stakhanovite senior manager would expect their teachers to provide five personalised multiple-choice questions in response to every piece of writing.

Obviously, in the examples above I have asked lots of follow-up questions. This isn’t particularly efficient. What we really want is to be able to ask one question that will generate everything we want, and ideally for it to be easily editable too so we can correct any errors that creep in. We have built a system like this and integrated it with our No More Marking Comparative Judgement dashboard.
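To give a rough idea of what "one question" might look like, here is a minimal sketch that folds the follow-up prompts from this post into a single request. It is only an illustration: the function name `build_feedback_prompt`, its parameters and the exact wording of the instructions are all assumptions, not the prompt our system uses.

```python
def build_feedback_prompt(question: str, essay: str, n_questions: int = 5) -> str:
    """Combine the separate follow-up requests into one plain-text prompt.

    Hypothetical helper: the instruction wording below is an assumption
    based on the follow-up questions described in this post.
    """
    instructions = [
        "Give the essay a grade and briefly explain why.",
        "Identify the one or two writing habits this student most needs to "
        "improve, focusing on the student rather than on redrafting this essay.",
        f"Write {n_questions} multiple-choice questions that practise those "
        "habits, each with four options and exactly one correct answer, "
        "and state which option is correct.",
    ]
    # Number the instructions so each part of the reply is easy to find and edit.
    numbered = "\n".join(
        f"{i}. {text}" for i, text in enumerate(instructions, start=1)
    )
    return (
        f"Essay question:\n{question}\n\n"
        f"Student response:\n{essay}\n\n"
        f"Please do the following:\n{numbered}"
    )
```

Because the result is plain text, a teacher can still read and correct whatever comes back, which matters given how often the multiple-choice questions need fixing.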

Your students can type their responses into our system, and you can choose what percentage of the judging you would like completed by humans, and what percentage by AI. Once all the judging is complete, each student will get a grade and a specific comment. You can easily review all the scripts and edit the comments. You can also print off the scripts with the comments so the students have a record of them. You could still do whole-class feedback the next lesson using some screenshots of the scripts, and you could potentially use ChatGPT separately to create some good MCQs, if you wanted. And depending on how much judging you chose to do, you’d still be saving time compared to any kind of traditional marking.
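As a rough illustration of the human/AI split, the arithmetic might look something like the sketch below. The function `split_judgements` and the idea of a fixed total number of judgements are assumptions made for the example; this is not a description of how our dashboard allocates judging internally.

```python
def split_judgements(total: int, ai_percent: float) -> tuple[int, int]:
    """Return (human_judgements, ai_judgements) for a judging session.

    Hypothetical helper: `total` is the number of judgements needed and
    `ai_percent` is the share (0 to 100) that should be done by AI.
    """
    if total < 0:
        raise ValueError("total must not be negative")
    if not 0 <= ai_percent <= 100:
        raise ValueError("ai_percent must be between 0 and 100")
    ai = round(total * ai_percent / 100)
    # Whatever the AI does not do is left for human judges.
    return total - ai, ai
```

For example, with 200 judgements and 75 per cent assigned to AI, teachers would be left with 50 judgements to complete themselves.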

Here are some screenshots of how our system works.

Judging dashboard, including human & AI judges

The feedback page — the student essay is at the bottom & editable AI comments on right

The results PDF — it combines all the essays with their score & edited AI feedback

We’ll go through how our system works with subscribers in our webinar next Wednesday at 4pm. Sign up here.

No More Marking
