Crowd-Sourced and Attending Assessment of General Surgery Resident Operative Performance Using Global Ratings Scales
Shanley B. Deal, MD, Rebecca E. Scully, MD, MPH, Gregory Wnuk, MHSA, Brian C. George, MD, MAEd, and Adnan A. Alseidi, MD, EdM

Journal of Surgical Education, 2020




We sought to assess the extent to which crowd and intraoperative attending ratings using the Objective Structured Assessment of Technical Skills (OSATS) or the Global Operative Assessment of Laparoscopic Skills (GOALS) correlate with the System for Improving and Measuring Procedural Learning (SIMPL) Zwisch and Performance scales.


Comparison of directly observed versus crowd-sourced review of operative video.


Operative video captured at 2 institutions.


Six core general surgery procedures, 3 open and 3 laparoscopic, were selected from the American Board of Surgery’s Resident Assessments list. Thirty-two cases performed by general surgery residents across all training levels at 2 institutions were filmed. Videos were condensed using a standardized protocol to include the critical portion of the procedure. Condensed videos were then submitted to Crowd-Sourced Assessment of Technical Skills (C-SATS), an online crowd-sourced assessment service, for rating with the appropriate resident assessment form (GOALS or OSATS) as well as with the SIMPL Zwisch and Performance scales. Crowd workers watched an educational tutorial on how to use the SIMPL Zwisch and Performance rating scales prior to participating. Attendings scored residents using the same tools immediately after the shared operative experience. Statistical analysis was performed using Pearson’s correlation coefficient.
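As an illustration of the statistical step, the following is a minimal sketch of Pearson’s correlation coefficient with a 95% confidence interval. The abstract does not state how the intervals were computed; the Fisher z-transformation used here is one common approach and is an assumption, as are the function names `pearson_r` and `fisher_ci`. The scores in the usage lines are invented placeholders, not study data.

```python
import math
from statistics import NormalDist  # available in Python 3.8+


def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares about the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / math.sqrt(ss_x * ss_y)


def fisher_ci(r, n, level=0.95):
    """Approximate confidence interval for r via the Fisher z-transformation.

    Assumed method: the abstract does not say how its intervals were derived.
    """
    if n < 4:
        raise ValueError("at least 4 paired observations are required")
    if abs(r) >= 1:
        raise ValueError("interval is undefined for |r| >= 1")
    z = math.atanh(r)                              # Fisher z-transform of r
    se = 1 / math.sqrt(n - 3)                      # standard error on the z scale
    crit = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for 95%
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


# Hypothetical per-video scores (placeholders, not taken from the study).
technical_scores = [1, 2, 3, 4, 5]
global_scores = [2, 1, 4, 3, 5]

r = pearson_r(technical_scores, global_scores)
lower, upper = fisher_ci(r, len(technical_scores))
print(f"r = {r:.2f}, 95% CI [{lower:.2f} to {upper:.2f}]")
```

With only a few dozen paired ratings, intervals computed this way are wide, so a point estimate alone says little about the strength of a correlation.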


Crowd raters evaluated 32 procedures using GOALS/OSATS, Zwisch and Performance (35-50 ratings per video). Attendings also evaluated all 32 procedures using GOALS/OSATS and 26 of the procedures using SIMPL Zwisch and Performance. Pearson correlation coefficients with 95% confidence intervals for crowd ratings were: GOALS and Zwisch 0.40 [0.73 to 0.10], OSATS and Zwisch 0.11 [0.41 to 0.57], GOALS and Performance 0.06 [0.44 to 0.35], and OSATS and Performance 0.22 [0.46 to 0.20]. Pearson correlation coefficients for attendings were: GOALS and Zwisch (0.77), OSATS and Zwisch (0.65), GOALS and Performance (0.93), and OSATS and Performance (0.59).


Overall, correlations between crowd-sourced ratings using GOALS/OSATS and the SIMPL global operative performance rating tools were weak, whereas the corresponding correlations for attendings were strong. Direct attending assessment may be required for evaluation of global performance, while crowdsourcing may be more suitable for technical assessment. Further studies are needed to determine whether more extensive crowd training would improve the crowd's ability to evaluate global performance.



