Abstract

Video-recorded robotic-assisted surgeries allow the use of automated computer vision and artificial intelligence/deep learning methods for quality assessment and workflow analysis in surgical phase recognition. We considered a dataset of 209 videos of robotic-assisted laparoscopic inguinal hernia repair (RALIHR) collected from 8 surgeons, defined rigorous ground-truth annotation rules, then pre-processed and annotated the videos. We deployed seven deep learning models to establish the baseline accuracy for surgical phase recognition and explored four advanced architectures. For rapid execution of the studies, we initially engaged three dozen MS-level engineering students in a competitive classroom setting, followed by focused research. We unified the data processing pipeline in a confirmatory study, and explored a number of scenarios which differ in how the DL networks were trained and evaluated. For the scenario with 21 validation videos of all surgeons, the Video Swin Transformer model achieved ~0.85 validation accuracy, and the Perceiver IO model achieved ~0.84. Our studies affirm the necessity of close collaborative research between medical experts and engineers for developing automated surgical phase recognition models deployable in clinical settings.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call