Inter-rater Reliability for Metrics Scored in a Binary Fashion-Performance Assessment for an Arthroscopic Bankart Repair.

Authors :
Gallagher AG
Ryu RKN
Pedowitz RA
Henn P
Angelo RL
Source :
Arthroscopy : the journal of arthroscopic & related surgery : official publication of the Arthroscopy Association of North America and the International Arthroscopy Association [Arthroscopy] 2018 Jul; Vol. 34 (7), pp. 2191-2198. Date of Electronic Publication: 2018 May 02.
Publication Year :
2018

Abstract

Purpose: To determine the inter-rater reliability (IRR) of a procedure-specific checklist scored in a binary fashion for the evaluation of surgical skill and whether it meets a minimum level of agreement (≥0.8 between 2 raters) required for high-stakes assessment.

Methods: In a prospective randomized and blinded fashion, and after detailed assessment training, 10 Arthroscopy Association of North America Master/Associate Master faculty arthroscopic surgeons (in 5 pairs) with an average of 21 years of surgical experience assessed the video-recorded 3-anchor arthroscopic Bankart repair performance of 44 postgraduate year 4 or 5 residents from 21 Accreditation Council for Graduate Medical Education orthopaedic residency training programs from across the United States.

Results: No paired scores of resident surgeon performance evaluated by the 5 teams of faculty assessors dropped below the 0.8 IRR level (mean = 0.93; range 0.84-0.99; standard deviation = 0.035). A comparison between the 5 assessor groups with 1 factor analysis of variance showed that there was no significant difference between the groups (P = .205). Pearson's product-moment correlation coefficient revealed a strong and statistically significant negative correlation, that is, -0.856 (P < .000), indicating that as intra-operative error rate scores increased, the IRR decreased.

Conclusions: Arthroscopy Association of North America shoulder faculty raters from across the United States showed high levels of IRR in the assessment of an arthroscopic 3-anchor Bankart repair procedure. All paired assessments were above the 0.8 level and the mean IRR of all resident assessments was 0.93, indicating that they could be used for high-stakes decisions.

Clinical Relevance: With the move toward outcomes-based performance evaluation for graduate medical education, high-stakes assessments of surgical skill will require robust, reliable measurement tools that are able to withstand challenge. Surgical checklists employing metrics scored in a binary fashion meet the need and can show a high (>80%) IRR.

(Copyright © 2018 Arthroscopy Association of North America. Published by Elsevier Inc. All rights reserved.)
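
The abstract does not state how the IRR value for a pair of raters was calculated. As a minimal sketch, assuming IRR is taken as the proportion of checklist items on which the two raters gave the same binary score (agreements divided by agreements plus disagreements), the calculation and the 0.8 threshold check might look like the following. The function names and the example scores are hypothetical and are not taken from the study.

```python
def percent_agreement(rater_a, rater_b):
    """Proportion of items on which two raters gave the same binary score.

    Assumption: IRR = agreements / (agreements + disagreements).
    The abstract does not confirm that this is the formula the authors used.
    """
    if len(rater_a) != len(rater_b):
        raise ValueError("Both raters must score the same number of items")
    if not rater_a:
        raise ValueError("At least one scored item is required")
    agreements = sum(a == b for a, b in zip(rater_a, rater_b))
    return agreements / len(rater_a)


def meets_threshold(irr, minimum=0.8):
    """Check an IRR value against the minimum agreement level (>= 0.8)."""
    return irr >= minimum


# Hypothetical scores for one video: 1 = step performed / error observed, 0 = not.
rater_a = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1]
rater_b = [1, 1, 0, 1, 1, 1, 1, 1, 0, 1]

irr = percent_agreement(rater_a, rater_b)
print(irr, meets_threshold(irr))
```

In this made-up example the raters disagree on one of ten items, so the pair would clear the 0.8 level named in the Purpose statement. Simple proportion of agreement does not correct for chance agreement; a chance-corrected statistic would give different values.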

Details

Language :
English
ISSN :
1526-3231
Volume :
34
Issue :
7
Database :
MEDLINE
Journal :
Arthroscopy : the journal of arthroscopic & related surgery : official publication of the Arthroscopy Association of North America and the International Arthroscopy Association
Publication Type :
Academic Journal
Accession number :
29730215
Full Text :
https://doi.org/10.1016/j.arthro.2018.02.007