What’s the one thing that needs changing about AI governance?

Our evidence to a UK Parliamentary Select Committee on the use of AI in schools

Apr 03, 2023

Last week I gave evidence to the Science & Technology Parliamentary Select Committee about the use of AI in schools.

The hearing took place on the same day the government published a white paper on AI governance, and one of the first questions was: what one thing would you change about AI governance?

My answer – not that I have much expectation of it being fulfilled – is for greater transparency around the training sets used to create large language models like OpenAI’s ChatGPT. This would help organisations like ours and many others integrate GPT into their systems, and it would provide more certainty about what these models can and cannot do.

At the moment, ChatGPT is capable of some very impressive achievements and also some colossal errors. Many people have listed these, and I discussed one in particular that frustrated me - the way that when you asked it to solve a Pythagorean triple problem, it would confuse 13 squared and 14 squared. Is this because 13 squared is 169, 14 squared is 196, and the two results share the same digits in a different order? I don’t know. You can see this example here and you can also hear me discussing it on BBC Radio 4’s Yesterday in Parliament here at 15 minutes.
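For readers who want to check the arithmetic themselves, here is a minimal Python sketch. It is only an illustration: the triple (5, 12, 13) and the helper names `is_pythagorean_triple` and `same_digits` are my own assumptions, not the exact prompt or method from the example above.

```python
def is_pythagorean_triple(a: int, b: int, c: int) -> bool:
    """Return True when a^2 + b^2 equals c^2."""
    return a * a + b * b == c * c


def same_digits(x: int, y: int) -> bool:
    """Return True when x and y are written with the same digits in any order."""
    return sorted(str(x)) == sorted(str(y))


# The two squares that get confused.
thirteen_squared = 13 ** 2  # 169
fourteen_squared = 14 ** 2  # 196

# (5, 12, 13) is a genuine triple: 25 + 144 = 169.
print(is_pythagorean_triple(5, 12, 13))  # True

# Swapping in 14 breaks it: 25 + 144 is not 196.
print(is_pythagorean_triple(5, 12, 14))  # False

# 169 and 196 use the same digits, which may or may not explain the mix-up.
print(same_digits(thirteen_squared, fourteen_squared))  # True
```

The check itself is trivial for conventional code, which is what makes the error so frustrating when a language model gets it wrong.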


Its advocates claim that ChatGPT is not perfect but that it displays ‘patches of understanding’ that will increase in new models. Its detractors claim these patches of understanding are flukes – nothing more than stopped clocks that are right twice a day. If we knew more about the data ChatGPT was trained on, we would have a better idea of which claim is true. We could also use it more effectively. At the moment, everyone using ChatGPT is stuck with a rather frustrating trial-and-error process: they try out different approaches to see if they work, but have very little certainty about how ChatGPT arrives at its answers or whether it can produce such answers reliably.

Can AI help education?

I spoke alongside Professor Rose Luckin. She quite rightly pointed out that many different AI models have been used in education for years now. At No More Marking, all our national Comparative Judgement assessments are judged by humans, but we do use AI very successfully for back-office tasks. For example, one technical issue we face is that schools uploading their students’ writing to our website will sometimes accidentally upload a couple of blank sheets of paper from absentee students. If just 1% of the 50,000 responses in an assessment are blank sheets, that is 500 blank pages, which is enough to cause problems. We trained an AI model to recognise and remove blank sheets of paper from judging tasks. It works very well: it’s very accurate and saves us all a lot of time.

This is not a particularly glamorous use of AI, but it is reliable and helpful! We know that admin and data management tasks take up too much of teachers’ time, and we know that currently existing models of AI are very good at these kinds of tasks. So perhaps that’s where the most immediate gains are to be had.

How will AI affect assessment?

Rose and I agreed with each other about most things, but we did disagree about the impact AI will have on assessment. I think the arrival of ChatGPT means that coursework and non-examined written assessments will essentially become impossible to police, and that we need to look at replacing them with exams. I also think that we still need to teach and assess the skills that AI is good at - things like basic maths and reading - because humans cannot develop more advanced skills without mastering basic ones first. So I reject the idea that education and assessment should focus only on the things AI cannot do.

The transcript of the debate will be published shortly, and you will be able to read Rose’s views on this there. We hope to continue the conversation - it’s always productive to identify exact areas of agreement and disagreement.

What about plagiarism?

The second session of the day took evidence from Matt Glanville from the International Baccalaureate and Joel Kenyon, a secondary science teacher. The questioning focussed very closely on the impact ChatGPT will have on coursework. The IB’s position is that ChatGPT can be used as a source as long as it is referenced, and that their coursework tasks can continue as normal. They expect teachers to discuss issues of academic integrity with students and to recognise and deal with issues of unreferenced uses of ChatGPT.

My own take is that this is quite unrealistic and risks adding significantly to teacher workload. For something that has such significant implications for the validity of qualifications and for the workload of teachers, I would at least like to see some data or research on the ability of teachers to spot essays with significant AI plagiarism. Our own small-scale research showed that teachers were unable to do so. However, we were asking teachers to tell apart AI writing and the writing of real students they had never met before. Of course, if teachers know the students, we would expect them to make more correct decisions. But will they spot all of it? Will they get every decision right? How confident do you have to be in a judgement to accuse a student of plagiarism and require them to redo their work - both quite significant steps to take? Will they spot it even if students have been using ChatGPT since the start of their course, so that teachers never get a true picture of the students’ baseline standard? These are all important questions that I feel deserve more research.

What does the future hold for AI in education?

There are a lot of questions that deserve a lot more research! We’ll keep doing what we can and publishing the results here.
