Abstract

Artificial intelligence (AI) has numerous applications in radiology. Clinical research studies to evaluate the AI models are also diverse. Consequently, diverse outcome metrics and measures are employed in the clinical evaluation of AI, presenting a challenge for clinical radiologists. This review aims to provide conceptually intuitive explanations of the outcome metrics and measures that are most frequently used in clinical research, specifically tailored for clinicians. While we briefly discuss performance metrics for AI models in binary classification, detection, or segmentation tasks, our primary focus is on less frequently addressed topics in published literature. These include metrics and measures for evaluating multiclass classification; those for evaluating generative AI models, such as models used in image generation or modification and large language models; and outcome measures beyond performance metrics, including patient-centered outcome measures. Our explanations aim to guide clinicians in the appropriate use of these metrics and measures.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.