Multi-perspective region-based CNNs for vertebrae labeling in intraoperative long-length images

Y Huang, C.K Jones, X Zhang, A Johnston, S Waktola, N Aygun, T.F Witham, A Bydon, N Theodore, P.A Helm, J.H Siewerdsen, A Uneri

doi:10.1016/j.cmpb.2022.107222

Abstract

Purpose: Effective aggregation of intraoperative x-ray images that capture the patient anatomy from multiple view-angles has the potential to enable and improve automated image analysis that can be readily performed during surgery. We present multi-perspective region-based neural networks that leverage knowledge of the imaging geometry for automatic vertebrae labeling in Long-Film images, a novel tomographic imaging modality with an extended field-of-view for spine imaging.

Method: A multi-perspective network architecture was designed to exploit small view-angle disparities produced by a multi-slot collimator and consolidate information from overlapping image regions. A second network incorporates large view-angle disparities to jointly perform labeling on images from multiple views (viz., AP and lateral). A recurrent module incorporates contextual information and enforces anatomical order for the detected vertebrae. The three modules are combined to form the multi-view multi-slot (MVMS) network for labeling vertebrae using images from all available perspectives. The network was trained on images synthesized from 297 CT images and tested on 50 AP and 50 lateral Long-Film images acquired from 13 cadaveric specimens. Labeling performance of the multi-perspective networks was evaluated with respect to the number of vertebrae appearances and the presence of surgical instrumentation.

Results: The MVMS network achieved an F1 score of >96% and an average vertebral localization error of 3.3 mm, with 88.3% labeling accuracy on both AP and lateral images (15.5% and 35.0% higher than conventional Faster R-CNN on AP and lateral views, respectively). Aggregation of multiple appearances of the same vertebra using the multi-slot network significantly improved the labeling accuracy (p < 0.05). Using the multi-view network, labeling accuracy on the more challenging lateral views was improved to the same level as that of the AP views. The approach demonstrated robustness to the presence of surgical instrumentation, commonly encountered in intraoperative images, and achieved comparable performance in images with and without instrumentation (88.9% vs. 91.2% labeling accuracy).

Conclusion: The MVMS network demonstrated effective multi-perspective aggregation, providing a means for accurate, automated vertebrae labeling during spine surgery. The algorithms may be generalized to other imaging tasks and modalities that involve multiple views with view-angle disparities (e.g., bi-plane radiography). Predicted labels can help avoid adverse events during surgery (e.g., wrong-level surgery), establish correspondence with labels in preoperative modalities to facilitate image registration, and enable automated measurement of spinal alignment metrics for intraoperative assessment of spinal curvature.
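
The idea of consolidating multiple appearances of the same vertebra can be illustrated with a minimal sketch. This is not the MVMS network: it simply averages per-label scores over appearances that are assumed to be already matched (in the paper, correspondence follows from the known imaging geometry). The function name `aggregate_appearances`, the integer vertebra identifiers, and the score dictionaries are all hypothetical and chosen for illustration only.

```python
from collections import defaultdict


def aggregate_appearances(detections):
    """Average label scores over all appearances of the same vertebra.

    detections: list of (vertebra_id, scores) pairs, where `scores` maps a
    label (e.g. "L1") to a probability for one appearance of that vertebra.
    A label missing from an appearance is treated as having score 0 there.
    Returns a dict mapping vertebra_id -> (best_label, mean_score).
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for vertebra_id, scores in detections:
        counts[vertebra_id] += 1
        for label, p in scores.items():
            sums[vertebra_id][label] += p

    result = {}
    for vertebra_id, label_sums in sums.items():
        n = counts[vertebra_id]
        means = {label: s / n for label, s in label_sums.items()}
        best = max(means, key=means.get)
        result[vertebra_id] = (best, means[best])
    return result


# Hypothetical example: vertebra 0 appears in two overlapping image regions
# with conflicting top labels; vertebra 1 appears only once.
detections = [
    (0, {"L1": 0.6, "L2": 0.4}),
    (0, {"L1": 0.3, "L2": 0.7}),
    (1, {"L3": 0.9, "L2": 0.1}),
]
labels = aggregate_appearances(detections)
```

In this toy example the two appearances of vertebra 0 disagree, and averaging favors the label with the higher mean score. A learned aggregation, as described in the abstract, would replace this fixed averaging rule.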

Full Text