Theory, Instruments, Scales and Modelling
The four critical aspects of measurement - plus table tapping
Recently I had the great pleasure of teaching the core module Psychometrics on the MSc in Educational Assessment at Oxford University.
Every day I began with a quiz covering the learning of the previous day, and every day I included the question: “What are the 4 critical aspects of measurement?” By day 5 the students could all tell me that they were ‘Theory, instruments, scales and modelling’ [1].
Why was I so insistent on them memorising these aspects? It is because every time they encounter someone doing some modelling, they should ask themselves,
“What theory, instruments and scales underpin this modelling?”
What do we mean by theory, and why is it important?
If we take the writing assessments we deliver at No More Marking, our theoretical framework is that a student’s ability to express themselves in a written form is of fundamental value in itself. Our theory does not extend to claiming that writing is an expression of a student’s creativity, their ability to reason or even their suitability for future forms of education. Starting with theory ensures that you are careful with your inferences, and leads you to avoid terms such as intelligence, creativity, merit or general ability that plagued psychometrics in the past.
Lacking a theoretical background and with grand ambitions, you may find that all assessment results are correlated, which can lead you into testing how fast students can tap on a table in 10 seconds as part of your selection testing. There may well be some plausible connection between psychophysics and ability to succeed in later education, but you need to start with the theory, not the test, or risk a school filled with furiously tapping pupils unable to perform algebraic manipulations!
What are our instruments?
The key is to choose assessment instruments as transparent as possible to your underlying theory. This is not easy: simply asking students to replicate a real-world task will mean assessing extraneous and irrelevant domains. But if you narrow the task down too much it loses touch with reality.
We see this problem with writing assessment. We could set open-ended research tasks that require students to spend a couple of weeks investigating a topic, writing it up, gathering feedback and redrafting their work. This might be valuable, but it is no longer a direct assessment of writing. Likewise, you can go to the other extreme and set tests like the KS2 grammar test. Again, this may have value, but it is not an assessment of writing.
At No More Marking, we assess writing using Comparative Judgement and our instruments are standardised writing prompts. Our writing prompt may unintentionally reward some aspect of performance or knowledge or lead to the dreaded blank page and tears, but we feel that on balance it is the most direct and transparent assessment of writing we have seen.
What are our scales?
With physical properties, scales are fixed and meaningful. Weights and measures can be added and subtracted, distances multiplied and trajectories plotted to distant planets we have not yet visited. The comparison with physical properties is a useful and interesting one which we have discussed on this Substack before, but unfortunately scales in assessment are much harder to pin down and differences harder to interpret.
At No More Marking we introduced writing ages, a measure of difference between performances based on the empirical relationship between writing progress and age. Can you ever really compare the writing of a five-year-old with the writing of an eleven-year-old? Well, that depends on our theory. If both are attempts to convey meaning through the written word, then yes, we can. What differentiates writing changes as students grow older and progress from making marks to conveying tone and attitude. Teachers may be presented with the difficult task of comparing well-constructed simple sentences against tone and attitude desperately lacking in syntax, but our underlying position of the fundamental importance of writing and its development throughout schooling allows us to use a value-laden term such as ‘writing age’.
… and lastly modelling …
… where so many of us in psychometrics feel comfortable, and from whence we would never emerge if not prompted by educationalists and psychologists. Psychometricians aim to fit models to data that increase the efficiency of our assessments and sharpen the inferences we can make. At No More Marking we use statistical models to reduce the time taken for teachers to judge and to enhance insights into the development of writing.
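For readers who like to see the modelling step made concrete, here is a minimal sketch of how a set of comparative judgements might be turned into a scale. It assumes a Bradley-Terry model, which is a common choice for paired comparisons but is my assumption here rather than a description of the model No More Marking actually runs. The function name `fit_bradley_terry`, the script labels and the judgement data are all hypothetical, and the sketch assumes every script wins at least one comparison and is never compared with itself.

```python
import math


def fit_bradley_terry(judgements, n_iter=200):
    """Estimate one score per script from (winner, loser) judgements.

    Illustrative only: assumes a Bradley-Terry model and that every
    script wins at least one comparison.
    Returns a dict mapping script -> log-strength, centred on zero.
    """
    scripts = sorted({s for pair in judgements for s in pair})
    wins = {s: 0 for s in scripts}
    pair_counts = {}  # how often each unordered pair was judged
    for winner, loser in judgements:
        wins[winner] += 1
        key = frozenset((winner, loser))
        pair_counts[key] = pair_counts.get(key, 0) + 1

    strength = {s: 1.0 for s in scripts}
    for _ in range(n_iter):
        updated = {}
        for s in scripts:
            # Standard iterative update: wins divided by the expected
            # "exposure" of this script across all its comparisons.
            denom = 0.0
            for key, n in pair_counts.items():
                if s in key:
                    (other,) = key - {s}
                    denom += n / (strength[s] + strength[other])
            updated[s] = wins[s] / denom
        # Rescale so the log-strengths average to zero.
        log_mean = sum(math.log(v) for v in updated.values()) / len(updated)
        strength = {s: v / math.exp(log_mean) for s, v in updated.items()}
    return {s: math.log(v) for s, v in strength.items()}


# Hypothetical judgements: each tuple is (preferred script, other script).
example_judgements = (
    [("A", "B")] * 3 + [("B", "A")]
    + [("B", "C")] * 3 + [("C", "B")]
    + [("A", "C")] * 3 + [("C", "A")]
)
scores = fit_bradley_terry(example_judgements)
for script in sorted(scores, key=scores.get, reverse=True):
    print(script, round(scores[script], 2))
```

The scores that come out are only as meaningful as the theory, instrument and scale behind them: the model orders the scripts, but it cannot tell you what the ordering is an ordering of.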
Dear students
So, dear students, ‘Theory, instruments, scales… and then modelling.’ If you remember nothing else from my course, remember to start with a credible theory and be wary of tapping on tables!
[1] https://doi.org/10.1111/jedm.12350