TY - JOUR
T1 - Reliable Assessment of Surgical Technical Skills Is Dependent on Context
T2 - An Exploration of Different Variables Using Generalizability Theory
AU - Andersen, Steven Arild Wuyts
AU - Park, Yoon Soo
AU - Sørensen, Mads Sølvsten
AU - Konge, Lars
PY - 2020/12
Y1 - 2020/12
N2 - PURPOSE: Reliable assessment of surgical skills is vital for competency-based medical training. Several factors influence not only the reliability of judgments but also the number of observations needed for making judgments of competency that are both consistent and reproducible. The aim of this study was to explore the role of various conditions, through the analysis of data from large-scale, simulation-based assessments of surgical technical skills, by examining the effects of those conditions on reliability using generalizability theory. METHOD: Assessment data from large-scale, simulation-based temporal bone surgical training research studies in 2012-2018 were pooled, yielding collectively 3,574 assessments of 1,723 performances. The authors conducted generalizability analyses using an unbalanced random-effects design, and they performed decision studies to explore the effect of the different variables on projections of reliability. RESULTS: Overall, 5 observations were needed to achieve a generalizability coefficient > 0.8. Several variables modified the projections of reliability: increased learner experience necessitated more observations (5 for medical students, 7 for residents, and 8 for experienced surgeons), the more complex cadaveric dissection required fewer observations than virtual reality simulation (2 vs 5 observations), and increased fidelity simulation graphics reduced the number of observations needed from 7 to 4. The training structure (either massed or distributed practice) and simulator-integrated tutoring had little effect on reliability. Finally, more observations were needed during initial training when the learning curve was steepest (6 observations) compared with the plateau phase (4 observations). CONCLUSIONS: Reliability in surgical skills assessment seems less stable than it is often reported to be. Training context and conditions influence reliability. The findings from this study highlight that medical educators should exercise caution when using a specific simulation-based assessment in other contexts.
AB - PURPOSE: Reliable assessment of surgical skills is vital for competency-based medical training. Several factors influence not only the reliability of judgments but also the number of observations needed for making judgments of competency that are both consistent and reproducible. The aim of this study was to explore the role of various conditions, through the analysis of data from large-scale, simulation-based assessments of surgical technical skills, by examining the effects of those conditions on reliability using generalizability theory. METHOD: Assessment data from large-scale, simulation-based temporal bone surgical training research studies in 2012-2018 were pooled, yielding collectively 3,574 assessments of 1,723 performances. The authors conducted generalizability analyses using an unbalanced random-effects design, and they performed decision studies to explore the effect of the different variables on projections of reliability. RESULTS: Overall, 5 observations were needed to achieve a generalizability coefficient > 0.8. Several variables modified the projections of reliability: increased learner experience necessitated more observations (5 for medical students, 7 for residents, and 8 for experienced surgeons), the more complex cadaveric dissection required fewer observations than virtual reality simulation (2 vs 5 observations), and increased fidelity simulation graphics reduced the number of observations needed from 7 to 4. The training structure (either massed or distributed practice) and simulator-integrated tutoring had little effect on reliability. Finally, more observations were needed during initial training when the learning curve was steepest (6 observations) compared with the plateau phase (4 observations). CONCLUSIONS: Reliability in surgical skills assessment seems less stable than it is often reported to be. Training context and conditions influence reliability. The findings from this study highlight that medical educators should exercise caution when using a specific simulation-based assessment in other contexts.
KW - Clinical Competence
KW - Humans
KW - Learning Curve
KW - Orthopedic Procedures/education
KW - Reproducibility of Results
KW - Simulation Training
KW - Temporal Bone/surgery
UR - http://www.scopus.com/inward/record.url?scp=85096886868&partnerID=8YFLogxK
U2 - 10.1097/ACM.0000000000003550
DO - 10.1097/ACM.0000000000003550
M3 - Journal article
C2 - 32590473
SN - 1040-2446
VL - 95
SP - 1929
EP - 1936
JO - Academic medicine : journal of the Association of American Medical Colleges
JF - Academic medicine : journal of the Association of American Medical Colleges
IS - 12
ER -