This study presents machine learning (ML) models that predict, at the initial computed tomography (CT) appointment, whether deep inspiration breath hold (DIBH) is needed based on lung dose in right-sided breast cancer patients.
Materials and methods. 
Anatomic distances were extracted from a single-institution dataset of free-breathing (FB) CT scans from locoregional right-sided breast cancer patients. Models were developed using combinations of anatomic distances and ML classification algorithms (gradient boosting, k-nearest neighbors, logistic regression, random forest, and support vector machine) and optimized over 100 iterations using stratified 5-fold cross-validation. Models were grouped by the number of anatomic distances used during development; within each group, the model with the highest validation accuracy was selected as a final model. Final models were compared based on their predictive ability, measurement collection efficiency, and robustness to simulated user error during measurement collection.
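The search over distance combinations and the stratified 5-fold split described above can be sketched as follows. This is a minimal illustration in plain Python, not the study's implementation: the function names, the placeholder distance names (`d1` to `d8`), and the synthetic labels are all assumptions made for the example, and the classifier fitting step is omitted.

```python
import random
from collections import defaultdict
from itertools import combinations


def stratified_k_fold(labels, k=5, seed=0):
    """Return k (training, validation) index splits that keep class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's samples round-robin so every fold gets a fair share.
        for position, index in enumerate(indices):
            folds[position % k].append(index)

    splits = []
    for held_out in range(k):
        validation = sorted(folds[held_out])
        training = sorted(
            index
            for fold_number, fold in enumerate(folds)
            if fold_number != held_out
            for index in fold
        )
        splits.append((training, validation))
    return splits


def distance_subsets(distance_names, size):
    """Enumerate every combination of `size` anatomic distances."""
    return list(combinations(distance_names, size))


# Synthetic labels (assumption, not study data): 1 = DIBH needed, 0 = not needed.
labels = [1] * 40 + [0] * 60
splits = stratified_k_fold(labels, k=5, seed=0)

# Placeholder names for eight candidate distances (hypothetical).
distance_names = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"]
four_distance_subsets = distance_subsets(distance_names, 4)
```

Each candidate subset would then be paired with each classifier and scored on the validation fold of every split; stratification keeps the proportion of DIBH-needed cases the same in each fold, so validation accuracy is comparable across folds.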
Results. 
This retrospective study included 238 patients treated between 2016 and 2021. Model development ended once eight anatomic distances were included and the validation accuracy had plateaued. The best-performing model used logistic regression with four anatomic distances, achieving 80.5% average testing accuracy with few false negatives and false positives (< 27%). The anatomic distances required for prediction were collected within 3 minutes, and the model was robust to simulated user error during measurement collection, with accuracy changing by < 5%.
Conclusion. 
Our logistic regression model using four anatomic distances provided the best balance between efficiency, robustness, and ability to predict whether DIBH was needed for locoregional right-sided breast cancer patients.