Abstract

BackgroundThis study is to propose a clinically applicable 2-echelon (2e) diagnostic criteria for the analysis of thyroid nodules such that low-risk nodules are screened off while only suspicious or indeterminate ones are further examined by histopathology, and to explore whether artificial intelligence (AI) can provide precise assistance for clinical decision-making in the real-world prospective scenario.MethodsIn this prospective study, we enrolled 1036 patients with a total of 2296 thyroid nodules from three medical centers. The diagnostic performance of the AI system, radiologists with different levels of experience, and AI-assisted radiologists with different levels of experience in diagnosing thyroid nodules were evaluated against our proposed 2e diagnostic criteria, with the first being an arbitration committee consisting of 3 senior specialists and the second being cyto- or histopathology.ResultsAccording to the 2e diagnostic criteria, 1543 nodules were classified by the arbitration committee, and the benign and malignant nature of 753 nodules was determined by pathological examinations. Taking pathological results as the evaluation standard, the sensitivity, specificity, accuracy, and area under the receiver operating characteristic curve (AUC) of the AI systems were 0.826, 0.815, 0.821, and 0.821. For those cases where diagnosis by the Arbitration Committee were taken as the evaluation standard, the sensitivity, specificity, accuracy, and AUC of the AI system were 0.946, 0.966, 0.964, and 0.956. Taking the global 2e diagnostic criteria as the gold standard, the sensitivity, specificity, accuracy, and AUC of the AI system were 0.868, 0.934, 0.917, and 0.901, respectively. Under different criteria, AI was comparable to the diagnostic performance of senior radiologists and outperformed junior radiologists (all P < 0.05). Furthermore, AI assistance significantly improved the performance of junior radiologists in the diagnosis of thyroid nodules, and their diagnostic performance was comparable to that of senior radiologists when pathological results were taken as the gold standard (all p > 0.05).ConclusionsThe proposed 2e diagnostic criteria are consistent with real-world clinical evaluations and affirm the applicability of the AI system. Under the 2e criteria, the diagnostic performance of the AI system is comparable to that of senior radiologists and significantly improves the diagnostic capabilities of junior radiologists. This has the potential to reduce unnecessary invasive diagnostic procedures in real-world clinical practice.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call