Abstract
Temporally locating and classifying instruments in surgical video is useful for the analysis and comparison of surgical techniques. This paper applies action segmentation techniques to temporally segment and classify surgical instruments, and highlights the utility of this modelling approach through example applications. It shows that the action segmentation transformer (ASFormer) architecture with an EfficientNetV2 featurizer achieves significantly higher mean average precision than any previous approach to this task on the Cholec80 dataset. The ASFormer also outperforms Long Short-Term Memory (LSTM) and Multi-Stage Temporal Convolutional Network (MS-TCN) architectures with the same featurizer. This model reduces the need for costly human labelling of surgical video, supporting the development of indexed surgical video libraries and instrument usage tracking applications. Examples of these applications are included after the results.
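The paper's own implementation is not reproduced here, but the temporal modelling idea shared by the MS-TCN baseline (and, via its encoder layers, the ASFormer) can be illustrated with a simplified sketch: a dilated 1-D convolution over per-frame features with a residual connection. All names, shapes, and the plain-numpy formulation below are illustrative assumptions, not the authors' code.

```python
import numpy as np

def dilated_temporal_layer(features, weights, dilation):
    """One dilated temporal convolution (kernel size 3) with a
    residual connection, a simplified MS-TCN-style building block.
    features: (T, C) per-frame feature matrix (illustrative)
    weights:  (3, C, C) kernel taps (illustrative)
    """
    T, C = features.shape
    # zero-pad the time axis so output length equals input length
    padded = np.pad(features, ((dilation, dilation), (0, 0)))
    out = np.zeros_like(features)
    for t in range(T):
        # taps at frames t - d, t, t + d (indices shifted by padding)
        taps = padded[[t, t + dilation, t + 2 * dilation]]
        out[t] = sum(taps[k] @ weights[k] for k in range(3))
    # ReLU then residual, as in MS-TCN dilated residual layers
    return features + np.maximum(out, 0.0)

# Usage sketch: 10 frames of 4-dim features through one layer
rng = np.random.default_rng(0)
feats = rng.standard_normal((10, 4))
kernel = rng.standard_normal((3, 4, 4)) * 0.1
smoothed = dilated_temporal_layer(feats, kernel, dilation=2)
```

Stacking such layers with exponentially increasing dilations is what gives these temporal models a long receptive field over the video at low cost, which is why they suit minute-scale instrument usage segmentation.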
Published in: Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization