13 Comments
Sammy Wright

Very true - but my current feeling is that we also have to be careful of the sense of precision that fine-grained marks can give. I actually really like using descriptors without grades because the absence of ranking forces attention to the qualities of the writing, rather than the score.

Toddy Ashton

This is excellent. A continuous curve of improvement that removes 'thresholding' effects would make for a much better system. I actually totally disagree with other comments that we need less assessment. We just need better assessment that better informs teaching, wastes less time and creates less ambiguity. I would love to find a system that creates richer domain maps of competency. Even maths/English/reading is far too broad a spectrum from which to draw meaningful data and deliver effective, precise teaching.

John Bullock

> what both metrics have in common is that they replace a crude and distorting threshold system with a smooth and continuous metric.

About 15 years ago, a Yale faculty committee on grading proposed the replacement of letter grades with a numeric grading system. The proposal was not popular. When the faculty met to discuss it, one English professor called it "shameful." (I was on the faculty and at that meeting.)

Why did the Yale plan incur suspicion and resentment? A large part of the answer, I think, is that many people are innumerate or afraid of numbers. And I anticipate that most other efforts to replace "crude and distorting threshold systems" with smoother metrics will meet similar objections, for similar reasons.

MikeJ1000

You're overthinking it. Grades as discrete categories may not be perfect, but they are easy to understand. It's easy to spot the students who are WTS in Reading but EXS+ in Maths and Writing, and most teachers will apply professional common sense when looking at individual students. An extra % layer on top of each grade/category just adds complexity. We're at peak data in student assessment; we need less, not more.

Daisy Christodoulou

Is “this year 6 student is working at greater depth within the expected standard” intrinsically easier to understand than “this year 6 student has a writing age of 13?”

MikeJ1000

Fair point. For me I think it is, but I'm sure for one student you could argue it either way. But with a class of 30 Y6 students, how does it help the teacher to know there are kids with reading ages of 13, 12, 11, 10, 9, 4, 5, 6 etc? What do you do with that extra information? A simple system that makes it easy to spot the kids who are below the expected standard (for their age) may be good enough.

Daisy Christodoulou

My argument is that it does not make it easy to spot the kids who are below the expected standard. It guarantees a large class of "Pauls" - students who are struggling but who the system never flags up.

Health visitors do have target threshold weights for babies, but they still measure the actual weight and record it. They could just record the weight of babies as "below / medium / above" but they don't because they recognise that the extra detail is useful - particularly when it comes to measuring progress!

Englishman in Switzerland

An interesting line of research would be to see whether year 6 pupils, and their parents, understand this new system. Some example probability grades for imaginary pupils could be shown to year 6 pupils, who could then be asked to analyse them.

Tara Houle

Here in British Columbia, Canada, all areas of student assessment have been reduced to a scale, which indicates a student is either Emerging, Developing, Proficient or Extending. We don't have letter grades or percentages until two years before kids graduate. It's been a massive fail for students, teachers and parents, with the Opposition in government threatening to toss the current curriculum and assessment methodology as soon as they get to power, which will be sooner rather than later.

What letter grades do is offer a quick snapshot of where the student currently is. I don't know why that has to be replaced with anything. It's easy for kids to understand, as well as parents. Nothing is perfect, but I believe that complicating this and creating something more involved is rather taxing and not necessarily better. Kids don't need a whole bunch of explanation of why they received a mark and what needs to be better; they just need to know what the mark is. Furthermore, any post-secondary institution, as well as any employer, still uses grades to determine where a student or employee is and where they need to be.

I'm all for more ongoing assessment, but we need to remember how kids think, and what they want. Ditto for parents. They don't have time to make sense of graphs or big words... they just need a snapshot to see what's going on so they can get on with making supper. They instinctively just need a quick score to figure out what is what, rather than an in-depth graph they have to try to decipher. Appreciate the column.

Gavin Roddy

What a fantastic article (I also loved the Beatles reference). I have been doing a lot of thinking about grading, particularly about how grades do not necessarily communicate learning or growth. It is especially frustrating when performance on standardized tests is used to evaluate students and schools. Thank you so much for writing this.

Adam Boxer

Hi Daisy, thank you for writing this - I really enjoyed it. I have a couple of questions:

1. Is this about *all grades* or the primary three-grade system? I.e., do you apply the same logic to GCSEs and A-Levels?

2. In 2013, Ofqual did a survey on the use of scaled scores with a probability estimate as a potential replacement for grades. The research survey came back with the overwhelming majority of respondents saying that we should keep grades (https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/529378/2013-11-01_01-yougov-analysis-of-responses-to-the-2013-gcse-reform-consultation.pdf). What do you suspect the reason for that was, and do you think anything has changed in that intervening time?

Dennis Sherwood

"The grade probability is a measurement of certainty". Really? I thought that a 51% probability means "in 100 instances, 51 of them turn out like this, and 49 don't".

As an alternative, suppose the assessment doesn't award a grade, but the raw mark, plus or minus a defined confidence interval - for example 58 ± 5, where the confidence interval of ±5 is defined such that, say, there is a 99% probability that a fair re-mark will be between 53 and 63, and 1% probability that it won't.

Given that the likelihood that a re-mark will differ from the original mark is built in, the challenge process can be redefined such that any re-mark between 53 and 63 confirms the original assessment; any other re-mark - say, 65 - results in a new certificate showing 65 ± 5.

If the confidence interval is set wisely, the number of new assessments resulting from a challenge will be very small - in great contrast to the current situation, in which 24% (that's nearly 1 in 4) of challenges result in a grade change.

This is especially important as regards the current 10 GCSE and 6 A level grades.
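The mark-plus-interval scheme above can be sketched numerically. The snippet below is a minimal simulation, assuming re-marks behave like the original mark plus symmetric marker-to-marker noise (the noise model and its spread are illustrative assumptions, not exam-board data): it estimates what fraction of re-marks would fall inside a published interval of ±5 and so merely confirm the original assessment.

```python
import random

random.seed(0)

def remark(original_mark, marker_sd=2.0):
    """One simulated re-mark: the original mark plus random marking error.

    marker_sd is a hypothetical standard deviation of marker disagreement.
    """
    return original_mark + random.gauss(0, marker_sd)

def confirmation_rate(original_mark, half_width=5, trials=10_000):
    """Fraction of simulated re-marks landing inside mark ± half_width,
    i.e. challenges that confirm rather than overturn the original mark."""
    inside = sum(
        abs(remark(original_mark) - original_mark) <= half_width
        for _ in range(trials)
    )
    return inside / trials

rate = confirmation_rate(58)
print(f"58 ± 5: {rate:.1%} of simulated re-marks confirm the original mark")
```

Under these assumptions, the ±5 interval covers the overwhelming majority of re-marks, which is the commenter's point: if the interval is chosen to match real marking variability, very few challenges would produce a new certificate.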

Alex

Love the probability distribution, but suspect that schools will hate it. Hope I'm wrong!

Using three sig figs for the probability seems a bit much, though. How certain are we that it's 51.5% and not 52.1%?