In the last couple of weeks, I’ve given a few talks about improvements in artificial intelligence and what they mean for education.
Here you can download my slides from a big speech I gave at Eton College last month for the Inner Drive Teaching & Learning conference. And here are the slides from a researchED event yesterday.
I outlined the new AI marking features we’ve been working on at No More Marking. We think that hallucinations are a major weakness of Large Language Models, but that if you combine humans & LLMs in particular ways, you can mitigate the impact of hallucinations and produce useful and time-saving marking systems.
At the Eton event, one question I was asked at the end was how rapidly we thought AI would improve, and whether the flaws I’d outlined would soon be eliminated.
This is a very important question which we think about a lot. One big way in which LLMs have improved even in just the past few months is that they have got a lot cheaper. This makes a big difference to how you use them, as it means that certain use cases which were previously too expensive now become more realistic.
What about improvements in accuracy and in reducing the problems with hallucinations? I’d argue that the improvements here have not been as rapid as the improvements in cost.
So, my prediction is that significant cost reductions in LLMs are more likely than significant improvements in eliminating hallucinations.
Will technology only get better?
I’ve written before about how there is a pervasive idea that technology will inevitably always improve and so any flaw you spot with a technology doesn’t really matter because it will soon be fixed.
You can obviously point to lots of examples where this is the case, but you can equally clearly point to lots where it isn’t. One of the most high-profile, with global consequences, is energy production. In the 1950s, serious people predicted that nuclear power would result in energy that was “too cheap to meter.” That has not happened. The specific technology of fusion power has proved especially elusive.
Another problem with the “it will only get better” argument is that it forgets about the importance of time. Maybe it is inevitable that humanity does solve the fusion power problem, and that energy does end up being too cheap to meter. But it makes a huge difference if that process takes 6 months, 6 years or 60 years.
What does this mean for education?
I don’t think all of this uncertainty matters too much for what we teach in primary and most of secondary. As I have also written before, education is concerned with fundamental knowledge & skills like literacy and numeracy which don’t date, and which underpin all of the newer technologies. Cutting-edge technologies date faster than traditional technologies. The minidisc player and fax machine will be obsolete before the numbering system and alphabet.
However, this uncertainty does matter for adults in the world of work, and for students who are about to select a career or a vocational training course. You can’t just neatly sidestep it by saying “teach transferable skills”, either – because domain-general transferable skills don’t exist.
At the Eton Inner Drive conference, I asked the audience if they would recommend that a 17-year-old living in England today should learn to drive. If you are convinced that self-driving cars are inevitable in the very near future, you should recommend that they don’t bother. Interestingly, most of the audience said they should learn to drive.
Thinking about predictions in this very specific way is a useful discipline. Various superforecaster competitions ask participants to give the percentage likelihood of different world events. P(doom), which expresses the likelihood of AI leading to a catastrophic outcome, has become a popular metric in the last few years.
Making predictions like this is hard. In 2016, the famous computer scientist & Nobel Laureate Geoffrey Hinton said that “We should stop training radiologists now, it’s just completely obvious within five years deep learning is going to do better than radiologists.” We are nearly a decade on from that prediction, and we still need human radiologists.
Comparatively judging the future
Traditional forecasting competitions ask participants to make absolute judgements, but we know that humans are better at making comparative judgements. If you want to predict the future and try out Comparative Judgement at the same time, have a go at our trial task here. We have created 20 statements that make a prediction about the future, and you have to judge them in pairs and say which of the two is more likely.
So, for example, you don’t have to decide how likely it is that lawyers or hairdressers will become obsolete. You just have to decide which is more likely.
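To make the idea concrete, here is a minimal sketch of how a set of pairwise judgements can be turned into a ranking. The statements and the simple win-share scoring are my own illustrative assumptions; real Comparative Judgement systems typically fit a statistical model (such as Bradley–Terry) rather than counting raw wins.

```python
from collections import defaultdict

# Hypothetical pairwise judgements: each tuple (winner, loser) records
# that a judge rated the first statement more likely than the second.
judgements = [
    ("lawyers obsolete", "hairdressers obsolete"),
    ("lawyers obsolete", "fusion power by 2040"),
    ("hairdressers obsolete", "fusion power by 2040"),
    ("lawyers obsolete", "hairdressers obsolete"),
]

def rank_by_win_share(judgements):
    """Rank statements by the fraction of comparisons they won."""
    wins = defaultdict(int)
    appearances = defaultdict(int)
    for winner, loser in judgements:
        wins[winner] += 1
        appearances[winner] += 1
        appearances[loser] += 1
    return sorted(
        ((wins[s] / appearances[s], s) for s in appearances),
        reverse=True,
    )

for share, statement in rank_by_win_share(judgements):
    print(f"{share:.2f}  {statement}")
```

The appeal of the comparative approach is visible even in this toy version: no judge ever assigned an absolute probability, yet a consistent ordering of the statements still emerges from the pairs.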
We’ll share the overall results in a week or so’s time.