Surgical workflow recognition with 3DCNN for Sleeve Gastrectomy

Bokai Zhang, Henry Choi, Andrew Yoo, Alexander Simes, Amer Ghanem

doi:10.1007/s11548-021-02473-3

Abstract

Purpose: Surgical workflow recognition is a crucial and challenging problem when building a computer-assisted surgery system. Current techniques focus on combining a convolutional neural network with a recurrent neural network (CNN–RNN) to solve the surgical workflow recognition problem. In this paper, we attempt to use a deep 3DCNN to solve this problem.

Methods: To tackle the surgical workflow recognition problem and the imbalanced data problem, we implement a 3DCNN workflow referred to as I3D-FL-PKF. We use focal loss (FL) to train a 3DCNN architecture known as Inflated 3D ConvNet (I3D) for surgical workflow recognition. We use prior knowledge filtering (PKF) to filter the recognition results.

Results: We evaluate our proposed workflow on a large sleeve gastrectomy surgical video dataset. We show that focal loss can help to address the imbalanced data problem. We show that our PKF can be used to generate smoothed prediction results and improve the overall accuracy. We show that the proposed workflow achieves 84.16% frame-level accuracy and reaches a weighted Jaccard score of 0.7327, outperforming a traditional CNN–RNN design.

Conclusion: The proposed workflow can obtain consistent and smooth predictions not only within the surgical phases but also for phase transitions. By using focal loss and prior knowledge filtering, our implementation of a deep 3DCNN has great potential to solve surgical workflow recognition problems in clinical practice.

Highlights

Computer-assisted surgery (CAS) systems are one of the cornerstones of modern operating rooms
To quantify the improvement gained by using the Inflated 3D ConvNet (I3D) as the deep network architecture, a similar CNN–RNN workflow was implemented with InceptionV3-bidirectional LSTM (BiLSTM) as a replacement for I3D
I3D outperforms C3D by around 2.5% in accuracy, which demonstrates the importance of using a deep 3DCNN in the workflow

Summary

Methods

To tackle the surgical workflow recognition problem and the imbalanced data problem, we implement a 3DCNN workflow referred to as I3D-FL-PKF. We use focal loss (FL) to train a 3DCNN architecture known as Inflated 3D ConvNet (I3D) for surgical workflow recognition. We use prior knowledge filtering (PKF) to filter the recognition results.
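To illustrate why focal loss helps with imbalanced phase labels, the following is a minimal sketch of the standard focal loss for a single clip, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), written in plain Python. It is not the authors' training code: the function names, the default `gamma=2.0`, and the optional per-class `alpha` weights are illustrative assumptions, and the logits stand in for whatever a network such as I3D would output.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0, alpha=None):
    """Focal loss for one sample.

    logits: list of raw scores, one per surgical phase (hypothetical input).
    target: index of the ground-truth phase.
    gamma:  focusing parameter; gamma = 0 recovers plain cross-entropy.
    alpha:  optional list of per-class weights (assumed, not from the paper).
    """
    p_t = max(softmax(logits)[target], 1e-12)  # clamp to avoid log(0)
    weight = 1.0 if alpha is None else alpha[target]
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)


def cross_entropy(logits, target):
    """Plain cross-entropy, i.e. focal loss with gamma = 0."""
    return focal_loss(logits, target, gamma=0.0)


# A confidently correct clip versus a badly misclassified one.
easy = focal_loss([4.0, 0.0, 0.0], target=0)
hard = focal_loss([4.0, 0.0, 0.0], target=1)
print(f"easy example: {easy:.6f}, hard example: {hard:.6f}")
```

The modulating factor (1 - p_t)^gamma shrinks the loss of well-classified clips far more than that of misclassified ones, so frequent, easy phases contribute less to the gradient than rare or hard ones.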

Results

Conclusion

Introduction

Method

Experiments