Abstract

Real-time prediction of the remaining surgery duration (RSD) is important for optimal scheduling of resources in the operating room. We focus on the intraoperative prediction of RSD from laparoscopic video. We present an extensive evaluation of seven common deep learning models, a proposed model based on the Transformer architecture (TransLocal), and four baseline approaches. The proposed pipeline includes a CNN-LSTM for feature extraction from salient regions within short video segments and a Transformer with local attention mechanisms. On the Cholec80 dataset, TransLocal yielded the best performance, with a mean absolute error (MAE) of 7.1 min. For long and short surgeries, the MAE was 10.6 and 4.4 min, respectively. Thirty minutes before the end of surgery, the MAE was 6.2, 7.2, and 5.5 min for all, long, and short surgeries, respectively. The proposed technique achieves state-of-the-art results. In the future, we aim to incorporate intraoperative indicators and pre-operative data.
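As a rough illustration of how the reported metric can be computed, the sketch below evaluates MAE over a whole procedure and over the final 30 minutes. It is a minimal sketch under stated assumptions, not the evaluation code used in the study: the function names (`rsd_targets`, `mae`, `mae_last_window`), the fixed sampling step, and the choice to define the 30-minute window by ground-truth RSD are all hypothetical.

```python
def rsd_targets(duration_min, step_min=1):
    """Ground-truth RSD at each sampled time step: total duration minus elapsed time.

    Assumes predictions are sampled every `step_min` minutes (an assumption).
    """
    n_steps = int(duration_min / step_min) + 1
    return [duration_min - i * step_min for i in range(n_steps)]


def mae(pred, target):
    """Mean absolute error between predicted and ground-truth RSD, in minutes."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(target)


def mae_last_window(pred, target, window_min=30):
    """MAE restricted to time steps whose true RSD is at most `window_min`.

    This is one plausible reading of "thirty minutes before the end of surgery".
    """
    pairs = [(p, t) for p, t in zip(pred, target) if t <= window_min]
    return sum(abs(p - t) for p, t in pairs) / len(pairs)


# Toy example: a 40-minute procedure sampled every 10 minutes.
target = rsd_targets(40, step_min=10)   # [40, 30, 20, 10, 0]
pred = [45, 33, 18, 10, 1]              # made-up model outputs
overall = mae(pred, target)
final_30 = mae_last_window(pred, target, window_min=30)
```

Averaging per-video values of such a metric separately over long and short procedures would give figures comparable in kind to those quoted above, although the exact split and aggregation used in the study are not specified in this abstract.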
