Exploring the importance of graders in determining pupils' examination results using cross-classified multilevel modelling

Tom Benton

14 September 2006

High-stakes testing is a well-established part of education around the world. Test results are used for a range of purposes, including assessing the extent to which national performance targets have been met, providing information about the performance of individual schools, and informing the future teaching of pupils.

With such high stakes attached to pupil examinations, it is desirable to have an objective grading standard against which all pupils are assessed (see Moss, 1994). However, in disciplines such as English, where pupils are generally required to write longer answers or essays, graders must make subjective judgements about how well a question has been answered. Under such conditions, maintaining consistency between graders may be difficult: different graders may prefer different styles of writing or attach greater weight to different elements of a pupil's answer. The aim of this paper is to examine the extent of this variability and to explore its relationship with the characteristics of pupils.
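The kind of between-grader variability described above can be illustrated with a small simulation. This is a hypothetical sketch, not the paper's model or data: the grader, pupil, and noise standard deviations below are invented for illustration, and the variance decomposition is a crude method-of-moments calculation rather than the cross-classified multilevel model the paper uses.

```python
import random
import statistics

random.seed(42)

# Hypothetical set-up: each observed score combines a pupil's true
# ability, a grader-specific leniency effect, and residual noise.
N_GRADERS = 20
N_PUPILS_PER_GRADER = 50
ABILITY_SD = 10.0   # between-pupil spread of true ability (assumed)
GRADER_SD = 3.0     # between-grader spread in leniency (assumed)
NOISE_SD = 4.0      # residual marking noise (assumed)

records = []  # (grader_id, observed_score)
for g in range(N_GRADERS):
    leniency = random.gauss(0, GRADER_SD)
    for _ in range(N_PUPILS_PER_GRADER):
        ability = random.gauss(50, ABILITY_SD)
        records.append((g, ability + leniency + random.gauss(0, NOISE_SD)))

# Crude decomposition: the variance of grader means, corrected for the
# sampling variance of each mean, approximates the grader variance.
grader_scores = [[s for gg, s in records if gg == g] for g in range(N_GRADERS)]
grader_means = [statistics.mean(scores) for scores in grader_scores]
within_var = statistics.mean(statistics.pvariance(s) for s in grader_scores)

total_var = statistics.pvariance([s for _, s in records])
grader_var_est = max(
    statistics.pvariance(grader_means) - within_var / N_PUPILS_PER_GRADER, 0.0
)

print(f"share of score variance attributable to graders: "
      f"{grader_var_est / total_var:.1%}")
```

With these assumed parameters the grader effect accounts for roughly 9 / (100 + 9 + 16) of the score variance, so even a modest spread in grader leniency leaves a visible mark on pupils' results; the cross-classified modelling explored in the paper estimates this share formally while also accounting for pupil and school structure.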

This paper was presented at the European Conference on Educational Research on 14 September 2006 at the University of Geneva.