Introduction to standard setting

Michael Pollitt

The term standard setting refers to the process of determining the minimum level of competence required to pass an exam. Organisations use standard setting methodologies to define a defensible passing standard.

Methods for determining the passing standard fall broadly into two categories:

  • Norm-referencing methods: A predetermined proportion of candidates pass and fail, so the pass mark is decided relative to the overall performance of the cohort.
  • Criterion-referencing methods: Where the standard is decided based on an observed or estimable attribute of the test. Criterion referencing can be further subdivided into:
    • Candidate-centred methods: Use an expert judgement of the level of minimal competence based on the candidates observed in the exam.
    • Test-centred methods: Use an expert estimate of minimal competence based on the items in the exam.

risr/ assess supports the following standard setting methodologies, with built-in algorithms and processes to support their application in the system:

  • The Angoff method involves asking a set of expert judges to evaluate the items in the paper and make an estimate of the proportion of minimally competent candidates they would expect to answer the question correctly.

    Practices can vary, but commonly the expert panel will meet to discuss their judgements and make any adjustments that are required if significant discrepancies are found. Once that process has been conducted, the average of the judges' estimates for each item is recorded as the combined item estimate. The overall average of those item estimates then becomes the passing score for the exam.

    Angoff is commonly used for written exams and, in particular, works well where the exam is made up of items with a dichotomous outcome (correct/incorrect), such as single best answer (SBA) questions.
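
The averaging described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the risr/ assess implementation; the judge names, the estimates and the function name `angoff_cut_score` are all hypothetical.

```python
from statistics import mean

# Hypothetical data: for each judge, the estimated proportion of minimally
# competent candidates expected to answer each of four items correctly.
estimates = {
    "judge_a": [0.60, 0.75, 0.40, 0.85],
    "judge_b": [0.55, 0.80, 0.50, 0.90],
    "judge_c": [0.65, 0.70, 0.45, 0.80],
}

def angoff_cut_score(estimates):
    """Return (combined item estimates, overall passing proportion)."""
    per_judge = list(estimates.values())
    # Average across judges for each item (zip groups the estimates by item).
    item_estimates = [mean(item) for item in zip(*per_judge)]
    # The overall average of the item estimates is the passing score.
    return item_estimates, mean(item_estimates)

item_estimates, cut = angoff_cut_score(estimates)
print([round(e, 2) for e in item_estimates])
print(round(cut, 4))
```

With these made-up figures the passing score is 66.25% of the available marks, which is 2.65 marks on a four-item paper.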

  • The benchmark method allows organisations using norm-referencing approaches to visualise and apply a set of predetermined benchmarks to the cohort performance to determine the passing score. The following values can be applied by default:

    Benchmark       Outcome
    Mean            Pass mark is set at the cohort mean score
    Mean - 1 SD     Pass mark is set at the mean minus one standard deviation
    Mean - 2 SD     Pass mark is set at the mean minus two standard deviations
    Median          Pass mark is set at the cohort median score
    90% passrate    Pass mark is set at the point where 90% of candidates pass the exam (and 10% fail)
    95% passrate    Pass mark is set at the point where 95% of candidates pass the exam (and 5% fail)

    It is also possible to input your own benchmarks to visualise against the score distribution and apply as the passing standard. 

    One advantage to the benchmarking functionality in risr/ assess is that it allows for bimodal or multimodal cohorts to be considered. For example, if candidates in different year groups are being examined in the same event, the cohort can be segregated and different standards applied to each. 
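
As a rough illustration of how the default benchmarks relate to a cohort's scores, the sketch below computes them for a small made-up cohort. It rests on assumptions the article does not state: it uses the sample standard deviation, and it treats a pass-rate benchmark as the highest observed score that lets at least that percentage of candidates pass (tied scores could push the real pass rate higher). The function names are hypothetical.

```python
import math
from statistics import mean, median, stdev

def benchmark_pass_marks(scores):
    """Pass marks for the mean, SD and median based benchmarks."""
    m = mean(scores)
    sd = stdev(scores)  # assumption: sample standard deviation
    return {
        "Mean": m,
        "Mean - 1 SD": m - sd,
        "Mean - 2 SD": m - 2 * sd,
        "Median": median(scores),
    }

def pass_mark_for_pass_rate(scores, pass_percent):
    """Highest observed score at which at least pass_percent % of candidates pass."""
    ranked = sorted(scores, reverse=True)
    n_pass = math.ceil(pass_percent * len(ranked) / 100)
    return ranked[n_pass - 1]

# Hypothetical cohort of ten candidates.
scores = [42, 48, 51, 55, 58, 60, 63, 67, 72, 84]
marks = benchmark_pass_marks(scores)
ninety = pass_mark_for_pass_rate(scores, 90)
print(marks)
print(ninety)
```

For a mixed cohort, the same functions could be applied to each year group's scores separately to obtain a separate standard for each.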

  • The Ebel method is another test-centred approach. It is similar to Angoff in its application, except that expert judges are asked to score in a matrix format with the addition of a relevance parameter.

    The first step is to establish the Ebel matrix and apply probability values to each cell. A common approach is to use a 3x3 matrix with difficulty on one axis and relevance on the other. Each cell then receives a numerical value representing an estimate of the probability that a minimally competent candidate would answer the question correctly.

    The finished matrix may look something like the below:

              Acceptable    Important    Essential
    Easy      30%           60%          90%
    Medium    20%           50%          80%
    Hard      10%           40%          70%

    Judges are asked to provide both a difficulty and a relevance score on the three-point scale (without visibility of the underlying probability values). In a similar manner to the Angoff process, judging panels may meet to discuss their judgements; the averages of the resulting probability values are then computed, and the outcome becomes the cut score for the exam.

    risr/ assess has a built-in mechanism for Ebel scoring where panels can be invited to provide their judgments directly in the system.
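
The lookup-and-average step can be sketched as follows, using the example matrix above. The article does not spell out the order of averaging, so this sketch assumes each item's value is the mean of the judges' matrix probabilities and the cut score is the mean of those item values; the judgement data and function name are hypothetical.

```python
from statistics import mean

# Example matrix from the article: (difficulty, relevance) -> probability that
# a minimally competent candidate answers correctly.
EBEL_MATRIX = {
    ("Easy", "Acceptable"): 0.30, ("Easy", "Important"): 0.60, ("Easy", "Essential"): 0.90,
    ("Medium", "Acceptable"): 0.20, ("Medium", "Important"): 0.50, ("Medium", "Essential"): 0.80,
    ("Hard", "Acceptable"): 0.10, ("Hard", "Important"): 0.40, ("Hard", "Essential"): 0.70,
}

# Hypothetical judgements: each item holds one (difficulty, relevance) pair per judge.
judgements = {
    "item_1": [("Easy", "Essential"), ("Easy", "Important")],
    "item_2": [("Medium", "Important"), ("Medium", "Important")],
    "item_3": [("Hard", "Essential"), ("Medium", "Essential")],
}

def ebel_cut_score(judgements, matrix):
    """Average the matrix probabilities per item, then across items."""
    item_values = {
        item: mean(matrix[cell] for cell in cells)
        for item, cells in judgements.items()
    }
    return item_values, mean(item_values.values())

item_values, ebel_cut = ebel_cut_score(judgements, EBEL_MATRIX)
print(item_values)
print(round(ebel_cut, 4))
```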

  • The borderline groups method is a candidate-centred method whereby expert judges (i.e. examiners) are asked to identify the performance of a "borderline" competent candidate. This is usually conducted by adding a global judgement field to the mark sheet and asking examiners to provide an overall judgement of the candidate's performance which is separate to the scored criteria (checklist or domains) on the mark sheet.

    Once all data has been collected, the median station score for the borderline group of candidates is computed and this becomes the passing score for the station.
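
A minimal sketch of that calculation for a single station is shown below. The record layout, the "Borderline" label and the function name are assumptions made for illustration only.

```python
from statistics import median

# Hypothetical station results: (station score, examiner's global judgement).
records = [
    (8, "Fail"),
    (11, "Borderline"),
    (12, "Borderline"),
    (13, "Borderline"),
    (14, "Borderline"),
    (16, "Pass"),
    (18, "Pass"),
]

def borderline_groups_cut_score(records, borderline_label="Borderline"):
    """Median station score of the candidates judged borderline."""
    borderline_scores = [score for score, judgement in records
                         if judgement == borderline_label]
    if not borderline_scores:
        raise ValueError("no candidates were judged borderline at this station")
    return median(borderline_scores)

station_cut = borderline_groups_cut_score(records)
print(station_cut)
```

The explicit error for an empty borderline group reflects a practical limitation of the method: with no borderline judgements at a station, there is nothing to take a median of.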

  • Similar to borderline groups, borderline regression requires examiners to provide a judgement of what constitutes a "borderline" competent candidate during the exam using a separate global judgement entry on the mark sheet. The difference, however, is how the data is processed.

    Where borderline groups considers only the borderline group of candidates in the calculations, the borderline regression algorithm takes all of the recorded candidate observations and produces a regression plot with the domain or checklist scores on the y axis and the global judgements on the x axis. A line of best fit is added, and the score that the line predicts at the borderline judgement value becomes the cut score for the station.
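
The sketch below illustrates the idea with an ordinary least-squares line fitted by hand. The numeric coding of the global judgements (1 = fail, 2 = borderline, 3 = pass, 4 = good), the data and the function name are assumptions for illustration; the article does not describe the coding or fitting details used in risr/ assess.

```python
def borderline_regression_cut_score(judgements, scores, borderline_value):
    """Fit score = slope * judgement + intercept by least squares and return
    the score predicted at the borderline judgement value."""
    n = len(judgements)
    mean_x = sum(judgements) / n
    mean_y = sum(scores) / n
    sxx = sum((x - mean_x) ** 2 for x in judgements)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(judgements, scores))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope * borderline_value + intercept

# Hypothetical station data using the assumed 1-4 coding (2 = borderline).
global_judgements = [1, 2, 2, 3, 3, 4]
station_scores = [8, 11, 13, 15, 17, 20]

regression_cut = borderline_regression_cut_score(
    global_judgements, station_scores, borderline_value=2
)
print(regression_cut)
```

Because every candidate contributes to the fitted line, the cut score can still be computed when few candidates are judged borderline.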

  • Adjustable borderline regression behaves in exactly the same way as the borderline regression option. With this method, however, it is possible to adjust the intersect point for each station.

  • The McManus borderline regression algorithm uses the lower confidence interval of the regression line to calculate the cut score for the individual stations. An upper confidence interval is then applied to the aggregate passing score. 

How to Configure the Instance

Standard setting methods are a global configuration option that can only be set in the background by risr/ teams. The options below show which standard setting methods can be selected or deselected on your instance for either OSCE or Written exams. For further information regarding these and other configurable options, please see the Site configuration options page.

Screenshot 2025-03-05 at 10.26.26.png
