Surgical planning requires the identification of anatomical landmarks on medical images before, during, and after procedures. Landmark detection on X-rays is thus an important, though time-intensive, step in ensuring good surgical outcomes. The operative workflow can be made more efficient by using machine learning to automate landmark detection. Such models could also assist the development of robotic surgical procedures, since a surgical robot must orient itself to the patient's anatomy. Previously described models are effective but complex, making real-time landmark identification in a robotic system difficult. We propose a streamlined approach to landmark detection on pelvic radiographs that achieves accuracy comparable to gold-standard manual annotations. To train and evaluate the network, 902 pelvic radiographs (382 Outlet view, 520 AP view) were annotated with 22 landmarks and split into training (n = 700), validation (n = 99), and testing (n = 103) sets. A U-Net architecture with five encoding and five decoding layers was used. Landmark labels were converted into a 128 × 128 × 22 heatmap array by placing a Gaussian blur filter (5% of the image size) over each landmark, producing one 128 × 128 layer per landmark; each layer was zero everywhere except for a circular Gaussian around the landmark's location. A 256 × 256 array representing the image was fed into the U-Net, which output a 128 × 128 × 22 heatmap array. An intersection-over-union loss was used to train the network. After training for 100 epochs, the algorithm predicted the positions of the 22 clinically relevant landmarks with an average error of 2.36 ± 0.5 mm relative to ground-truth annotations made by trained experts and musculoskeletal radiologists experienced in pelvic scan analysis. The network accurately determined landmark positions despite highly variable contrast and brightness across the dataset, and prediction accuracy was invariant to view (AP: 2.27 ± 0.51 mm, Outlet: 2.45 ± 0.64 mm). A prediction took 1.13 ± 0.07 seconds. The network can thus rapidly provide accurate femoral landmarks on a pelvic radiograph, enabling the automation of numerous clinically relevant measurements. An important avenue for further research is whether these automated measurements can aid machine learning models in predicting femur fractures.
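The heatmap encoding and training objective described above can be sketched in a few lines. The snippet below is a minimal NumPy illustration, not the authors' implementation: the function names are hypothetical, interpreting "5% of image size" as the Gaussian's standard deviation is an assumption, and the soft (differentiable) form of the loss is one common reading of "intersection over union loss".

```python
import numpy as np

def make_heatmap_targets(landmarks_px, heatmap_size=128, sigma_frac=0.05):
    """Build a (heatmap_size, heatmap_size, n_landmarks) target array.

    landmarks_px: (n_landmarks, 2) array of (x, y) coordinates already
    scaled to heatmap resolution. Each output layer is zero everywhere
    except for a circular Gaussian centered on its landmark, matching the
    abstract's description. Treating the 5%-of-image-size figure as the
    Gaussian's standard deviation is an assumption.
    """
    sigma = sigma_frac * heatmap_size
    ys, xs = np.mgrid[0:heatmap_size, 0:heatmap_size]
    n_landmarks = len(landmarks_px)
    heatmaps = np.zeros((heatmap_size, heatmap_size, n_landmarks), dtype=np.float32)
    for k, (cx, cy) in enumerate(landmarks_px):
        d2 = (xs - cx) ** 2 + (ys - cy) ** 2
        heatmaps[:, :, k] = np.exp(-d2 / (2.0 * sigma ** 2))
    return heatmaps

def soft_iou_loss(pred, target, eps=1e-6):
    """Soft intersection-over-union loss over each heatmap layer.

    pred and target are arrays in [0, 1] of shape (H, W, C); the loss is
    1 minus the mean per-layer soft IoU.
    """
    inter = np.sum(pred * target, axis=(0, 1))
    union = np.sum(pred + target - pred * target, axis=(0, 1))
    return 1.0 - np.mean((inter + eps) / (union + eps))

# Example: targets for 22 random landmark locations at heatmap resolution.
rng = np.random.default_rng(0)
pts = rng.uniform(0, 128, size=(22, 2))
targets = make_heatmap_targets(pts)  # shape (128, 128, 22)
```

In practice the loss would be written with a deep learning framework's tensors so gradients can flow back through the U-Net; NumPy is used here only to keep the sketch self-contained.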