Centre for Assessment
Information about assessment
Test scores cannot be directly compared or combined on the basis of raw scores alone. It is not meaningful to compare or add together raw scores from tests that differ in type, length, time limit or difficulty. Standardised scores, on the other hand, are suitable for this purpose, and adding together standardised scores ensures that the tests all have equal weight.
Usually, tests are standardised so that the average standardised score automatically comes out as 100, irrespective of the difficulty of the test, so it is easy to see whether a pupil is above or below the average. The measure of the spread of scores is called the 'standard deviation', and this is usually set to 15 for educational attainment and ability tests. This means that, for example, irrespective of the difficulty of the test, about 68 per cent of the pupils in the national sample will have a standardised score within 15 points of the average (between 85 and 115) and about 95 per cent will have a standardised score within two standard deviations (30 points) of the average (between 70 and 130). These examples come from a frequency distribution known as 'the normal distribution', which is shown in the figure below. Published standardised scores usually range from 70 to 140 or from 69 to 141.
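The proportions quoted above can be checked against the normal distribution directly. The following is a minimal sketch, assuming an ideal continuous normal distribution with mean 100 and standard deviation 15; the function name proportion_within is introduced here for illustration only.

```python
import math

MEAN = 100  # average standardised score
SD = 15     # standard deviation of standardised scores

def proportion_within(k):
    """Proportion of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2):
    low, high = MEAN - k * SD, MEAN + k * SD
    print(f"between {low} and {high}: {proportion_within(k):.1%}")
```

For one and two standard deviations this gives roughly 68 and 95 per cent. A real national sample will only approximate these figures.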
If no age allowance were applied to the standardised scores, the equation for converting raw scores to standardised scores with a mean (average) score of 100 would be
S = 15(b - a)/sd + 100
where S is the pupil’s standardised score, b is the pupil’s raw score, a is the average raw score of all the pupils, and sd is the standard deviation of the raw scores.
As an example, take a test of 80 questions. After the test has been administered and marked, the average (or ‘mean’) raw score and the standard deviation of the raw scores are computed. Suppose the average score is 45 and the standard deviation is 12.5. For a pupil with a raw score of 55, the standardised score will be:
S = 15 × (55 - 45)/12.5 + 100 = 112
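The same calculation can be written as a short function. This is a sketch of the equation without any age allowance, not of the age-standardisation procedure itself; the function and parameter names are chosen here for illustration.

```python
def standardised_score(raw, mean_raw, sd_raw, target_mean=100, target_sd=15):
    """Convert a raw score to a standardised score, with no age allowance.

    raw      -- the pupil's raw score (b in the equation)
    mean_raw -- the average raw score of all the pupils (a)
    sd_raw   -- the standard deviation of the raw scores (sd)
    """
    return target_sd * (raw - mean_raw) / sd_raw + target_mean

# Worked example from the text: raw score 55, mean 45, standard deviation 12.5.
print(standardised_score(55, 45, 12.5))
```

A pupil whose raw score equals the average receives exactly 100, and a pupil one standard deviation above the average receives 115.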
However, to allow for the differing ages of the pupils as accurately and as fairly as possible across the complete score range, the age-standardised scores are calculated in a statistically much more complex way. The effect is nonetheless similar to applying the above equation separately to each group of pupils of the same age (to the nearest month).
Raw scores and standardised scores come from different scales, and are therefore not easily comparable with each other. An everyday example of this is the comparison of temperatures in degrees Fahrenheit and degrees Celsius. Fahrenheit temperatures above 32 degrees convert to positive numbers on the Celsius scale, whereas those below 32 degrees convert to negative numbers on the Celsius scale. The conversion of raw scores to age-standardised scores, however, is statistically much more complex than the conversion of Fahrenheit to Celsius. It depends on the level of difficulty of the test, the average score and the spread of scores in the test, and on the relative levels of performance by pupils of differing ages.
It should be understood that scores expressed as percentages are never used. Unlike standardised scores, percentages cannot relate to the average performance of the pupils or to the extent of the variation in test scores. Only by taking these into account can scores be placed on a common scale.
What does a standardisation table look like?
An example of a table can be seen here. In order to be more easily readable, this example is based upon results from a test of only 40 questions to show how a standardisation table typically works.
Because standardised scores depend upon a pupil's raw score and age, a standardisation table is called a ‘two-way entry table’. In a column at the left-hand side of the table are the raw scores. Along the top of the table are the different ages - for example, 10:11 means 10 years and 11 months. As an illustration, a pupil aged 10:07 with a raw score of 12 will have a standardised score of 88 on this example test.
Two features of standardisation tables can be seen in this example:
1) As one moves along a row from left to right (i.e. as the age increases), the standardised scores decrease slightly. This is the age allowance at work, compensating for the fact mentioned earlier that, almost invariably, younger pupils score slightly lower on average. The rate at which standardised scores decrease with increasing age will vary from one test to another, and therefore the pattern observed in this table may well be different to that applicable to other tables.
2) The inclusion of this age allowance means that a younger pupil can achieve the same standardised score as an older pupil whilst having a slightly lower raw score. As stated before, this is in order not to disadvantage summer-born children in comparison to pupils who happen to have been born, say, in the previous autumn. An important consequence of this is that, in whatever month pupils were born, roughly the same proportion will achieve the specified pass mark. This is because pupils are, in effect, only being compared with other pupils of the same age as themselves.
As one moves down any particular column, the standardised scores increase. This means quite simply that higher raw scores will result in higher standardised scores.
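The two-way lookup can be sketched in code. Only one cell below, the standardised score of 88 for a raw score of 12 at age 10:07, comes from the example in the text. Every other value is an invented placeholder, chosen simply so that scores fall along each row and rise down each column; the names AGES, TABLE and look_up are likewise illustrative.

```python
# Column headings: ages written as years:months, as in the table described above.
AGES = ["10:06", "10:07", "10:08"]

# Rows are keyed by raw score. Only the value 88 (raw score 12, age 10:07) is
# quoted in the text; all other values are invented placeholders.
TABLE = {
    11: [87, 86, 85],
    12: [89, 88, 87],
    13: [91, 90, 89],
}

def look_up(raw_score, age):
    """Return the standardised score for a raw score (row) and an age (column)."""
    return TABLE[raw_score][AGES.index(age)]

print(look_up(12, "10:07"))  # the one cell taken from the text
```

A real standardisation table covers the full range of raw scores and ages for the test concerned, and its values come from the standardisation sample rather than from a simple pattern like this one.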