No More Marking

Education, assessment and technology by Daisy Christodoulou & Dr Chris Wheadon

How good is ChatGPT at writing essays? Some data!

How did ChatGPT-produced writing do when compared to writing by real 9 year olds?

Daisy Christodoulou
Feb 09, 2023

Since ChatGPT was launched last year, there have been a number of claims about how good it is at writing essays. Can it produce top grade 9 GCSE essays? A* A-level essays? Grade 5 or Grade 6 GCSE essays? My own claim is that for pure writing assessments (the ‘Question 5’ type ones on the AQA English Language GCSE paper), it is very good. However, these are all just claims. Is there any way we can get data on this question?

We were already running a large-scale Year 4 writing assessment of about 50,000 pupils from about 1,100 primary schools. The Year 4s had to write about whether they thought mobile phones should be banned for under-13s. We created an extra dummy ‘ChatGPT’ school featuring 8 essays that were all written by ChatGPT.

We did not design them all to be perfect. We mixed up the prompt a couple of times, and we included a couple that were deliberately humorous!

In our previous blog, we showed that teachers could not reliably tell the difference between the ChatGPT scripts and the human ones. In this blog, we’ll share the actual results of the 8 ChatGPT scripts.

In all of our assessments, we select a set of moderation scripts: about 20–30% of pupil scripts, which are judged by teachers in other schools as well as by teachers in their own school. For this assessment, the moderation sample consisted of 14,083 scripts. How did the ChatGPT scripts do compared to this sample? Here is a table showing their results.

So, we can see that for the four essays where we kept the prompt the same as the one the students got, the scores were all in the top 7%. The top two essays scored in the top percentile and were within the margin of error of the top script. The only three to finish outside the top 7% were ones where we had changed the prompt to be unusual or to include errors. Interestingly, the essay where we asked ChatGPT to write the article in the form of a song also scored very highly! You may remember from our previous blog that the essay in the style of Harry Potter was the only one that was flagged by our teachers as being ChatGPT-authored.
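For readers who want to see what a statement like "in the top 7%" means in practice, here is a minimal Python sketch. It assumes each script has a single numeric score and simply counts how much of a sample scored at or above a given script. The function name `top_percent` and the sample scores are made up for illustration; they are not our real data or our actual analysis code.

```python
from bisect import bisect_left


def top_percent(score, sample_scores):
    """Return the percentage of sample scripts scoring at or above `score`.

    A result of 7.0 means the script sits in the top 7% of the sample.
    """
    ordered = sorted(sample_scores)
    # bisect_left gives the number of scripts scoring strictly below `score`.
    below = bisect_left(ordered, score)
    at_or_above = len(ordered) - below
    return 100.0 * at_or_above / len(ordered)


# Hypothetical sample: 100 scripts with scores 1..100 (not real assessment data).
hypothetical_sample = list(range(1, 101))
print(top_percent(94, hypothetical_sample))
print(top_percent(100, hypothetical_sample))
```

With a real moderation sample, the same calculation would be run over all 14,083 scaled scores rather than this toy list.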

These pieces of writing were assessed with Comparative Judgement, which is an assessment technique that rewards the overall holistic quality of a piece of writing, not just the accuracy. For ChatGPT to do well on an assessment like this, being accurate is not enough. It has to have some element of style and originality too.
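To give a feel for how a set of "which script is better?" decisions can become a scale of scores, here is a rough Python sketch. It fits a Bradley-Terry model, which is one common way of scaling paired comparisons; treat that choice as an assumption for illustration rather than a description of the exact model behind these results. The function name `fit_bradley_terry`, the script labels and the judgements are all hypothetical.

```python
import math
from collections import defaultdict


def fit_bradley_terry(judgements, iterations=200):
    """Estimate a quality score per script from (winner, loser) judgements.

    Uses the classic iterative update for the Bradley-Terry model.
    Assumes no script is ever compared with itself.
    Returns a dict mapping script -> log-strength (higher is better).
    """
    wins = defaultdict(int)
    pair_counts = defaultdict(int)
    scripts = set()
    for winner, loser in judgements:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1
        scripts.update((winner, loser))

    strength = {s: 1.0 for s in scripts}
    for _ in range(iterations):
        updated = {}
        for s in scripts:
            denom = 0.0
            for pair, n in pair_counts.items():
                if s in pair:
                    (other,) = pair - {s}
                    denom += n / (strength[s] + strength[other])
            # Small floor keeps a script with zero wins from collapsing to 0.
            updated[s] = max(wins[s], 1e-6) / denom
        # Rescale so the average log-strength is zero (scores are relative).
        mean_log = sum(math.log(v) for v in updated.values()) / len(updated)
        scale = math.exp(mean_log)
        strength = {s: v / scale for s, v in updated.items()}
    return {s: math.log(v) for s, v in strength.items()}


# Hypothetical judgements: A usually beats B, B usually beats C.
toy_judgements = (
    [("A", "B")] * 3 + [("B", "A")]
    + [("B", "C")] * 3 + [("C", "B")]
    + [("A", "C")] * 3 + [("C", "A")]
)
scores = fit_bradley_terry(toy_judgements)
for script in sorted(scores, key=scores.get, reverse=True):
    print(script, round(scores[script], 2))
```

The point of the sketch is that no judge ever applies a mark scheme: each judge only picks the better of two scripts, and the scale emerges from many such holistic decisions.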

So what does a top-percentile impossible-to-recognise robot essay look like? Here you are!

Essentially, ChatGPT reached the ceiling of scores on this task, so we have not assessed its limits. To do so, we’d have to carry out a similar project with the essays of older pupils. Stay tuned!
