Abstract

Purpose: Surgical gesture recognition is an essential task for providing intraoperative context-aware assistance and scheduling clinical resources. However, previous methods are limited in their ability to capture long-range temporal information, and many of them require additional sensors. To address these challenges, we propose a symmetric dilated network, SD-Net, that jointly recognizes surgical gestures and assesses surgical skill levels using only RGB surgical video sequences.

Methods: We use symmetric 1D temporal dilated convolution layers to hierarchically capture gesture cues under different receptive fields, so that features from different time spans can be aggregated. In addition, a self-attention network is bridged in the middle to compute global frame-to-frame relationships.

Results: We evaluate our method on a robotic suturing task from the JIGSAWS dataset. On gesture recognition, our method outperforms the state of the art by up to approximately 6 points in frame-wise accuracy and approximately 8 points in F1@50 score. It also maintains 100% prediction accuracy on the skill assessment task under the LOSO validation scheme.

Conclusion: The results indicate that our architecture obtains representative surgical video features by extensively considering the spatial, temporal, and relational context of the raw video input. Furthermore, the better performance under multi-task learning implies that surgical skill assessment is complementary to the gesture recognition task.
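As a rough illustration of why stacked dilated temporal convolutions see progressively longer time spans, the sketch below computes the receptive field (in frames) after each layer of a stride-1 dilated 1D convolution stack. The kernel size of 3 and the mirrored dilation schedule (1, 2, 4, 8, 8, 4, 2, 1) are illustrative assumptions, as is reading "symmetric" as "increasing then decreasing dilation". None of these values are taken from the paper, and the helper names are hypothetical.

```python
def symmetric_schedule(depth):
    """Hypothetical mirrored dilation schedule: 1, 2, ..., 2^(depth-1), then back down."""
    up = [2 ** i for i in range(depth)]
    return up + up[::-1]


def receptive_fields(dilations, kernel_size=3):
    """Cumulative receptive field (in frames) after each stride-1 dilated conv layer."""
    fields = []
    rf = 1  # before any convolution, each output depends on one frame
    for d in dilations:
        # a kernel with k taps and dilation d spans (k - 1) * d additional frames
        rf += (kernel_size - 1) * d
        fields.append(rf)
    return fields


dilations = symmetric_schedule(4)      # assumed depth of 4 per half
fields = receptive_fields(dilations)   # assumed kernel size of 3

for d, rf in zip(dilations, fields):
    print(f"dilation={d:2d}  receptive field={rf:3d} frames")
```

Under these assumptions, early layers aggregate only a few neighbouring frames while deeper layers cover dozens. That is the sense in which features from different time spans become available to the network.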

Highlights

  • There has been growing interest in building context-aware systems (CAS) that use information available inside the operating room (OR) to provide clinicians with contextual support

  • The results prove that the multi-task architecture improves the performance of surgical gesture recognition without any additional human annotation

  • Evaluation Metrics: For surgical gesture recognition, we evaluate our method at both the frame level and the segmental level, using frame-wise accuracy for the former and the edit score and segmental F1 score for the latter
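The three metrics named above can be sketched in plain Python as follows. This is a minimal sketch using the definitions commonly found in the action-segmentation literature (frame-wise accuracy, a normalized Levenshtein distance over segment labels, and segmental F1 at an IoU threshold). It is not the authors' evaluation code, and the function names and toy label sequences are hypothetical.

```python
def to_segments(labels):
    """Collapse per-frame labels into (label, start, end) runs; end is exclusive."""
    segs, start = [], 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            segs.append((labels[start], start, i))
            start = i
    return segs


def frame_accuracy(pred, gt):
    """Percentage of frames whose predicted label matches the ground truth."""
    return 100.0 * sum(p == g for p, g in zip(pred, gt)) / len(gt)


def edit_score(pred, gt):
    """100 * (1 - normalized Levenshtein distance) between segment label sequences."""
    p = [s[0] for s in to_segments(pred)]
    g = [s[0] for s in to_segments(gt)]
    prev = list(range(len(g) + 1))
    for i, pl in enumerate(p, start=1):
        curr = [i]
        for j, gl in enumerate(g, start=1):
            cost = 0 if pl == gl else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return 100.0 * (1 - prev[-1] / max(len(p), len(g)))


def f1_at_k(pred, gt, overlap=0.5):
    """Segmental F1: a predicted segment is a true positive if its IoU with an
    unmatched ground-truth segment of the same label reaches the threshold."""
    p_segs, g_segs = to_segments(pred), to_segments(gt)
    matched = [False] * len(g_segs)
    tp = fp = 0
    for label, ps, pe in p_segs:
        best_iou, best_j = 0.0, -1
        for j, (g_label, gs, ge) in enumerate(g_segs):
            if g_label != label:
                continue
            inter = max(0, min(pe, ge) - max(ps, gs))
            union = max(pe, ge) - min(ps, gs)
            if inter / union > best_iou:
                best_iou, best_j = inter / union, j
        if best_j >= 0 and best_iou >= overlap and not matched[best_j]:
            tp += 1
            matched[best_j] = True
        else:
            fp += 1
    fn = len(g_segs) - tp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 100.0 * 2 * precision * recall / (precision + recall)


# Toy example with made-up gesture labels (10 frames).
gt = ["G1"] * 4 + ["G2"] * 4 + ["G3"] * 2
pred = ["G1"] * 3 + ["G2"] * 5 + ["G3"] + ["G2"]

print(frame_accuracy(pred, gt), edit_score(pred, gt), f1_at_k(pred, gt, 0.5))
```

In the toy example, the spurious one-frame "G2" segment at the end costs only a single frame of accuracy, but it counts as a full extra segment for the edit score and as a false positive for F1@50. This is why segmental metrics are reported alongside frame-wise accuracy.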

Summary

Introduction

There has been growing interest in building context-aware systems (CAS) that use information available inside the operating room (OR) to provide clinicians with contextual support. Such systems enable various applications along the whole patient care pathway, such as clinical resource scheduling and report generation [15]. Surgical activities share a similar visual environment because of the similar appearance, color, and texture of human anatomical structures. However, the surgical process is specific to the medical condition, the surgeon, and the patient, so it varies significantly from one procedure to another.
