Selecting an appropriate deep learning (DL) model for healthcare research is challenging because evaluation criteria are diverse and health-related tasks are complex, so a single metric such as accuracy is often insufficient. Motivated by the need for a structured, multi-criteria approach, this study proposes a Multi-Criteria Decision Analysis (MCDA) framework based on the Analytic Hierarchy Process (AHP). Our primary contribution is a comprehensive decision-making framework that integrates multiple evaluation criteria, such as accuracy, sensitivity, specificity, and computational complexity, with empirical data from the existing literature to compare DL models systematically. We validated the framework through a use case on selecting the best DL model for diagnosing COVID-19 from X-ray images, in which we compared eight popular models, including ResNet34, SqueezeNet, and AlexNet. We also evaluated it through comparative scenarios using traditional methods, including weighted sum, weighted average, and accuracy-based evaluation. Quantitative results show that SqueezeNet achieved the highest score in the AHP framework (88.64), whereas ResNet34 performed best under traditional methods such as weighted sum (588.49) and accuracy ranking (98.33%). A sensitivity analysis further demonstrated the impact of varying criteria weights, showing how changes in the importance of accuracy and precision influenced model ranking. These findings highlight the flexibility and robustness of the AHP framework in addressing the complexities of model selection in healthcare research. This work suggests that a structured, data-driven evaluation approach can provide more nuanced and reliable insights than traditional approaches such as single-metric evaluation, ultimately supporting more informed decision-making in healthcare applications.
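To make the AHP-based scoring described above concrete, the following is a minimal sketch of how criteria weights might be derived from a pairwise comparison matrix and then used to rank models. It is an illustration under stated assumptions, not the implementation used in this study: the pairwise judgments, the model names `model_a` and `model_b`, and their normalized criterion values are all hypothetical, and the row geometric mean method is used here as a common approximation of the AHP priority vector.

```python
import math

# Saaty's random consistency index, keyed by matrix size n.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}

def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric mean method."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]

def consistency_ratio(pairwise, weights):
    """Consistency ratio CR = CI / RI; values below 0.1 are usually accepted."""
    n = len(pairwise)
    # Estimate lambda_max as the mean of (A w)_i / w_i.
    aw = [sum(a * w for a, w in zip(row, weights)) for row in pairwise]
    lambda_max = sum(x / w for x, w in zip(aw, weights)) / n
    ci = (lambda_max - n) / (n - 1)
    ri = RANDOM_INDEX[n]
    return ci / ri if ri else 0.0

def score(values, weights):
    """Weighted score of one model from its normalized criterion values."""
    return sum(v * w for v, w in zip(values, weights))

# Hypothetical pairwise judgments (not taken from the study).
# Criteria order: accuracy, sensitivity, specificity, computational complexity.
# Entry [i][j] states how much more important criterion i is than criterion j.
pairwise = [
    [1.0,  2.0, 2.0, 4.0],
    [0.5,  1.0, 1.0, 2.0],
    [0.5,  1.0, 1.0, 2.0],
    [0.25, 0.5, 0.5, 1.0],
]

# Hypothetical criterion values, already normalized to [0, 1] so that
# higher is better (complexity is expressed as an efficiency score).
models = {
    "model_a": [0.98, 0.97, 0.96, 0.30],  # most accurate, but heavy
    "model_b": [0.95, 0.96, 0.95, 0.90],  # slightly less accurate, lightweight
}

weights = ahp_weights(pairwise)
cr = consistency_ratio(pairwise, weights)
scores = {name: score(vals, weights) for name, vals in models.items()}
ranking = sorted(scores, key=scores.get, reverse=True)

print("weights:", [round(w, 4) for w in weights])
print("consistency ratio:", round(cr, 4))
print("ranking:", ranking)
```

With these illustrative inputs, the lighter model ranks first even though the other has higher accuracy, which shows the kind of divergence between multi-criteria and accuracy-only rankings that the abstract reports. Changing the entries of `pairwise` and re-running the script gives a simple form of sensitivity analysis.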