Abstract

Temporally locating and classifying instruments in surgical video is useful for the analysis and comparison of surgical techniques. This paper applies action segmentation techniques to temporally segment and classify surgical instruments, and highlights the utility of this modelling approach through example applications. It shows that the action segmentation transformer (ASFormer) architecture with an EfficientNetV2 featurizer achieves significantly higher mean average precision than any previous approach to this task on the Cholec80 dataset. The ASFormer also outperforms Long Short-Term Memory (LSTM) and Multi-Stage Temporal Convolutional Network (MS-TCN) architectures with the same featurizer. This model reduces the need for costly human labelling of surgical video, supporting the development of indexed surgical video libraries and instrument usage tracking applications. Examples of these applications are included after the results.
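
To make the two-stage setup concrete, the sketch below illustrates the general pipeline described above: a pretrained EfficientNetV2 backbone extracts per-frame features, and a temporal model then classifies instrument presence frame by frame. This is a minimal illustration only, assuming a torchvision EfficientNetV2-S backbone and using a simple LSTM stand-in for the temporal head; the paper's ASFormer and MS-TCN heads, training procedure, and hyperparameters are not reproduced here, and the class count and tensor shapes are illustrative.

```python
# Minimal sketch (not the paper's code): per-frame features from a pretrained
# EfficientNetV2 backbone, fed to a simple temporal model. The LSTM head below
# is an illustrative stand-in for the temporal architectures compared in the
# paper (LSTM, MS-TCN, ASFormer); shapes and sampling are assumptions.
import torch
import torch.nn as nn
from torchvision.models import efficientnet_v2_s, EfficientNet_V2_S_Weights

NUM_INSTRUMENT_CLASSES = 7  # Cholec80 annotates seven instrument types

# Frozen spatial featurizer: drop the classification head, keep pooled features.
weights = EfficientNet_V2_S_Weights.DEFAULT
backbone = efficientnet_v2_s(weights=weights)
backbone.classifier = nn.Identity()
backbone.eval()
preprocess = weights.transforms()


class TemporalHead(nn.Module):
    """Stand-in temporal model operating on a sequence of frame features."""

    def __init__(self, feat_dim=1280, hidden=256, num_classes=NUM_INSTRUMENT_CLASSES):
        super().__init__()
        self.rnn = nn.LSTM(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * hidden, num_classes)

    def forward(self, feats):        # feats: (batch, time, feat_dim)
        out, _ = self.rnn(feats)
        return self.fc(out)          # per-frame instrument logits


# Example: score instrument presence in a dummy 16-frame clip.
frames = torch.rand(16, 3, 384, 384)              # T x C x H x W video frames
with torch.no_grad():
    feats = backbone(preprocess(frames))          # (16, 1280) per-frame features
logits = TemporalHead()(feats.unsqueeze(0))       # (1, 16, num_classes)
probs = torch.sigmoid(logits)                     # multi-label presence per frame
```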
