Abstract

Clinical decision support systems use image processing and machine learning methods to objectively predict cancer in histopathological images. Integral to the development of machine learning classifiers is the ability to generalize from training data to unseen future data. A classification model's ability to accurately predict class labels for new, unseen data is measured by performance metrics, which also inform the classifier model selection process. Based on our research, metrics commonly used in the literature (such as accuracy and the ROC curve) do not accurately reflect the trained model's robustness. To the best of our knowledge, no research has been conducted to quantitatively compare performance metrics in the context of cancer prediction in histopathological images. In this paper, we evaluate various performance metrics and show that the Lift metric has the highest correlation between the internal and external validation sets of a nested cross-validation pipeline (R² = 0.57). Thus, we demonstrate that the Lift metric best generalizes classifier performance among the 23 metrics that were evaluated. Using the Lift metric, we develop a classifier with a misclassification rate of 0.25 (4-class classifier) on data that the model was not trained on (external validation).
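
The following is a minimal sketch, not the authors' pipeline, of the kind of analysis the abstract describes: record a metric (here, lift, i.e. precision relative to the baseline positive rate) on the inner (internal) folds and on the held-out outer (external) folds of a nested cross-validation, then correlate the two sets of scores with R². The dataset, model, and lift_score helper are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_predict

def lift_score(y_true, y_pred):
    """Lift = precision of positive predictions / prevalence of positives."""
    tp = np.sum((y_pred == 1) & (y_true == 1))
    predicted_pos = np.sum(y_pred == 1)
    prevalence = np.mean(y_true == 1)
    if predicted_pos == 0 or prevalence == 0:
        return 0.0
    return (tp / predicted_pos) / prevalence

# Synthetic stand-in for a histopathology feature set.
X, y = make_classification(n_samples=600, n_features=20, random_state=0)
outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
internal_scores, external_scores = [], []

for train_idx, test_idx in outer.split(X, y):
    X_tr, y_tr = X[train_idx], y[train_idx]
    X_te, y_te = X[test_idx], y[test_idx]

    model = RandomForestClassifier(n_estimators=100, random_state=0)

    # Internal validation: cross-validated predictions within the training fold.
    inner_pred = cross_val_predict(model, X_tr, y_tr, cv=3)
    internal_scores.append(lift_score(y_tr, inner_pred))

    # External validation: fit on the training fold, score on the held-out fold.
    model.fit(X_tr, y_tr)
    external_scores.append(lift_score(y_te, model.predict(X_te)))

# R² of the relationship between internal and external scores: a higher value
# suggests the metric measured internally generalizes to unseen data.
r = np.corrcoef(internal_scores, external_scores)[0, 1]
print("R² between internal and external lift:", r ** 2)
```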
