A Review of Literature on Marking Reliability Research

Sarah Maughan, Jo Tisi, Gillian Whitehouse, Newman Burdett

07 June 2013

The focus of this review is marking reliability, which is affected by factors relating to the marking process, such as mark scheme design, individual marker behaviours and different marking processes. We review evidence of reliability at both item and whole paper level, as well as reports describing new methods for measuring reliability and new ways of improving the reliability of test and examination results. To this end the report aims to identify the main advances that have been made in improving and quantifying marking reliability. As such, this review of the literature forms part of the Ofqual Quality of Marking Project.

Key Findings

  • To make meaningful comparisons between findings from the different marking reliability studies, we need to have consensus on the terminology and statistical techniques used for measuring marking reliability.
  • Features of items and mark schemes can influence marking reliability: for example, using clearly specified mark schemes, setting lower maximum marks, or clarifying the level of cognitive demand placed on markers may improve marking reliability.
  • On-screen marking makes item-level data more easily available; it also allows the marking process to be continuously monitored, which enables early detection and correction of inaccuracy. On-screen marking appears to be as reliable as paper-based marking, even for long answers or essay questions.
  • Item-level marking, where more than one marker contributes to a pupil’s overall mark, is more reliable than having a single marker assess a whole script, because it reduces the effect of examiner biases and random errors.
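
The last finding can be illustrated with a small simulation. The sketch below is not taken from the report: it assumes a deliberately simple, hypothetical model in which each marker has a fixed severity (a constant number of marks added to or subtracted from every item they mark) and nothing else varies. The function name, parameters and default values are all illustrative assumptions.

```python
import random
import statistics


def simulate_total_errors(n_items=10, n_markers=50, n_scripts=2000,
                          severity_sd=0.5, seed=0):
    """Compare the spread of total-mark error under two marking designs.

    Hypothetical model (an assumption, not the report's method): each
    marker has a fixed severity, in marks per item, drawn from a normal
    distribution with mean 0 and standard deviation `severity_sd`.
    """
    rng = random.Random(seed)
    severities = [rng.gauss(0.0, severity_sd) for _ in range(n_markers)]

    whole_script_errors = []
    item_level_errors = []
    for _ in range(n_scripts):
        # Whole-script marking: one marker marks every item, so the same
        # severity is applied n_items times and accumulates.
        marker = rng.randrange(n_markers)
        whole_script_errors.append(n_items * severities[marker])

        # Item-level marking: each item goes to an independently chosen
        # marker, so lenient and severe markers tend to offset each other.
        item_level_errors.append(
            sum(severities[rng.randrange(n_markers)] for _ in range(n_items))
        )

    return (statistics.pstdev(whole_script_errors),
            statistics.pstdev(item_level_errors))


if __name__ == "__main__":
    whole_sd, item_sd = simulate_total_errors()
    print(f"SD of total-mark error, whole-script marking: {whole_sd:.2f}")
    print(f"SD of total-mark error, item-level marking:   {item_sd:.2f}")
```

Under these assumptions the error in a pupil's total mark is more widely spread when one marker marks the whole script, because that marker's severity is repeated on every item; when items are distributed across markers, individual severities partly cancel. Real marking data would also include item-specific and random error, which this sketch omits.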