Modeling Surgical Technical Skill Using Expert Assessment for Automated Computer Rating
- Authors
Sudha R. Pavuluri Quamme, Caprice C. Greenberg, David P. Azari, Robert G. Radwin, Carla M. Pugh, Lane L. Frasier, and Jacob A. Greenberg
- Subjects
Male, Female, Humans, Medicine, Surgery, Video Recording, Machine learning, Artificial Intelligence, Mean squared error, Motion, Knot tying, Tissue handling, Suture Techniques, Hand, Biomechanical Phenomena, Task Performance and Analysis, Observer Variation, Reproducibility of Results, Models, Theoretical, Clinical Competence, Computer algorithm, Algorithms - Abstract
OBJECTIVE: Computer vision was used to predict expert performance ratings from surgeon hand motions for tying and suturing tasks. SUMMARY BACKGROUND DATA: Existing methods, including the objective structured assessment of technical skills (OSATS), have proven reliable, but they do not readily discriminate at the task level. Computer vision may be used to evaluate distinct task performance throughout an operation. METHODS: Open surgeries were video recorded, and surgeon hands were tracked without sensors or markers. An expert panel of three attending surgeons rated tying and suturing video clips on continuous scales from 0 to 10 along three task measures adapted from the broader OSATS: motion economy, fluidity of motion, and tissue handling. Empirical models were developed to predict the expert consensus ratings from the hand kinematic data records. RESULTS: Predicted vs. panel ratings for suturing had slopes from 0.73 to 1 and intercepts from 0.36 to 1.54 (average R² = 0.81). Predicted vs. panel ratings for tying had slopes from 0.39 to 0.88 and intercepts from 0.79 to 4.36 (average R² = 0.57). The mean squared error between predicted and expert ratings was consistently less than the mean squared difference between individual expert ratings and the eventual consensus ratings. CONCLUSIONS: The computer algorithm consistently predicted the panel ratings of individual tasks and was more objective and reliable than individual assessment by surgical experts.
- Published
- 2019