Abstract
This study aimed to develop a deep-learning algorithm for anterior cruciate ligament (ACL) tear detection and to evaluate its accuracy on two external datasets. A database of 19,765 knee MRI scans (17,738 patients), acquired on scanners from different manufacturers and at different magnetic field strengths, was used to build a deep learning-based ACL tear detector. Fifteen percent showed partial or complete ACL rupture. Coronal and sagittal fat-suppressed proton density or T2-weighted sequences were used. A natural language processing (NLP) algorithm was used to automatically label the reports associated with each MRI exam. We compared the accuracy of our model on two publicly available external datasets: MRNet, Bien et al, USA (PLoS Med 15:e1002699, 2018); and KneeMRI, Stajduhar et al, Croatia (Comput Methods Prog Biomed 140:151-164, 2017). Receiver operating characteristic (ROC) curves, area under the curve (AUC), sensitivity, specificity, and accuracy were used to evaluate our model. Our neural networks achieved an AUC of 0.939 for the detection of ACL tears, with a sensitivity of 87% (0.875) and a specificity of 91% (0.908). After retraining on the Bien and Stajduhar datasets, our algorithm achieved AUCs of 0.962 (95% CI 0.930-0.988) and 0.922 (95% CI 0.875-0.962), respectively. Sensitivity, specificity, and accuracy were, respectively, 85% (95% CI 75-94%, 0.852), 89% (95% CI 82-97%, 0.894), and 0.875 (95% CI 0.817-0.933) for the Bien dataset, and 68% (95% CI 54-81%, 0.681), 93% (95% CI 89-97%, 0.934), and 0.870 (95% CI 0.821-0.913) for the Stajduhar dataset. Our algorithm showed high performance in the detection of ACL tears, with high AUC values on two external datasets, demonstrating its generalizability across different manufacturers and populations. This study shows the performance of an algorithm for detecting anterior cruciate ligament tears, with external validation on populations from countries and continents different from those of the study population.
• An algorithm for detecting anterior cruciate ligament ruptures was built from a large dataset of nearly 20,000 MRI scans, with an AUC of 0.939, a sensitivity of 87%, and a specificity of 91%.
• This algorithm was tested on two external populations from other countries: a dataset from an American population and a dataset from a Croatian population. Performance remained high on these two external validation populations (AUC of 0.962 and 0.922, respectively).
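The evaluation metrics named in the abstract (AUC, sensitivity, specificity, accuracy, and their 95% confidence intervals) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not state how the confidence intervals were obtained, so a percentile bootstrap is used here as one common choice; the 0.5 decision threshold, the function names, and the toy labels and scores are hypothetical and are not data from the study.

```python
import random

def confusion_metrics(y_true, y_score, threshold=0.5):
    """Sensitivity, specificity, accuracy at a fixed threshold (0.5 is an assumption)."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, y_score):
        pred = 1 if s >= threshold else 0
        if y == 1 and pred == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif pred == 1:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / len(y_true)
    return sensitivity, specificity, accuracy

def auc(y_true, y_score):
    """AUC as the probability that a positive case outscores a negative one (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def bootstrap_ci(y_true, y_score, metric, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI (an assumed method, not necessarily the authors')."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        yt = [y_true[i] for i in idx]
        if len(set(yt)) < 2:              # a resample needs both classes
            continue
        ys = [y_score[i] for i in idx]
        stats.append(metric(yt, ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Hypothetical toy data: 1 = ACL tear, 0 = no tear; scores are model outputs.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_score = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1, 0.35, 0.05]

sens, spec, acc = confusion_metrics(y_true, y_score)
auc_value = auc(y_true, y_score)
ci_low, ci_high = bootstrap_ci(y_true, y_score, auc)
print(sens, spec, acc, auc_value, (ci_low, ci_high))
```

The pairwise AUC above is quadratic in the number of cases, which is fine for a toy example; on a dataset the size of the one described, a rank-based implementation from an established statistics library would be the more practical choice.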