Abstract

Simple Summary

Batch harmonization of radiomic features extracted from magnetic resonance images of breast lesions from two databases was applied to an artificial intelligence/machine learning classification workflow. Training and independent test sets from the two databases, as well as the combination of them, were used in pre-harmonization and post-harmonization forms to investigate the generalizability of performance in the task of distinguishing between malignant and benign lesions. Most training and independent test scenarios were statistically equivalent, demonstrating that batch harmonization, together with feature selection on the harmonized features, can potentially yield generalizable classification models.

Radiomic features extracted from medical images may demonstrate a batch effect when cases come from different sources. We investigated classification performance using training and independent test sets drawn from two sources using both pre-harmonization and post-harmonization features. In this retrospective study, a database of thirty-two radiomic features, extracted from DCE-MR images of breast lesions after fuzzy c-means segmentation, was collected. There were 944 unique lesions in Database A (208 benign lesions, 736 cancers) and 1986 unique lesions in Database B (481 benign lesions, 1505 cancers). The lesions from each database were divided by year of image acquisition into training and independent test sets, separately by database and in combination. ComBat batch harmonization was conducted on the combined training set to minimize the batch effect on eligible features by database. The empirical Bayes estimates from the feature harmonization were applied to the eligible features of the combined independent test set. The training sets (A, B, and combined) were then used in training linear discriminant analysis classifiers after stepwise feature selection. The classifiers were then run on the A, B, and combined independent test sets. Classification performance with pre-harmonization features was compared to that with post-harmonization features, including their corresponding feature selection, using the area under the receiver operating characteristic curve (AUC) as the figure of merit. Four out of five training and independent test scenarios demonstrated statistically equivalent classification performance when compared pre- and post-harmonization. These results demonstrate that translation of machine learning techniques with batch data harmonization can potentially yield generalizable models that maintain classification performance.
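The train-then-apply pattern described in the abstract (estimate harmonization parameters on the combined training set, then apply those frozen estimates to the independent test set) can be illustrated with a minimal sketch. This is not ComBat: it omits the empirical Bayes shrinkage and any covariate preservation, and replaces them with a plain per-batch location-scale adjustment for a single feature. The function names and the toy values are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev

def fit_location_scale(values, batches):
    """Fit pooled and per-batch mean/SD for one feature on the training set.

    Simplified stand-in for ComBat (assumption): no empirical Bayes
    shrinkage and no covariates.
    """
    pooled = (mean(values), stdev(values))
    per_batch = {}
    for b in set(batches):
        vals = [v for v, bb in zip(values, batches) if bb == b]
        per_batch[b] = (mean(vals), stdev(vals))
    return pooled, per_batch

def apply_location_scale(values, batches, fitted):
    """Map values onto the pooled training scale using frozen training estimates."""
    (pooled_mu, pooled_sd), per_batch = fitted
    out = []
    for v, b in zip(values, batches):
        mu_b, sd_b = per_batch[b]  # raises KeyError for a batch unseen in training
        out.append((v - mu_b) / sd_b * pooled_sd + pooled_mu)
    return out

# Hypothetical toy values for one feature from two databases, A and B.
train_vals = [1.0, 2.0, 3.0, 11.0, 13.0, 15.0]
train_batch = ["A", "A", "A", "B", "B", "B"]
test_vals = [2.5, 12.0]
test_batch = ["A", "B"]

fitted = fit_location_scale(train_vals, train_batch)               # training data only
train_h = apply_location_scale(train_vals, train_batch, fitted)
test_h = apply_location_scale(test_vals, test_batch, fitted)       # no refitting on test data
```

Fitting only on the training set keeps the independent test set from influencing the harmonization parameters, which is the property the abstract's workflow relies on.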

Highlights

  • For a given medical imaging protocol, differences in the resulting medical images and the values extracted from them can arise when they are collected in different contexts

  • The concept of harmonization can have several different levels of application in medical imaging, such as at image acquisition, in which protocols are controlled to be implemented the same way in different contexts, or in the post-processing of acquired images to normalize them between two sources of data

  • The Matthews correlation coefficient (MCC) [34], a variation on the Pearson correlation coefficient, served as the figure of merit in the study, and a given batch harmonization method was determined to be better than no batch effect removal method if the difference in the MCC was greater than a pre-determined threshold
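As a rough illustration of the last highlight, the Matthews correlation coefficient can be computed from the four confusion-matrix counts, and the decision rule can be written as a comparison against a threshold. The function names are hypothetical, and the threshold is left as a required argument because its value is not stated here.

```python
from math import sqrt

def mcc_from_counts(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom

def harmonization_helps(mcc_without, mcc_with, threshold):
    """Hypothetical decision rule: the method counts as better only if the MCC gain exceeds the threshold."""
    return (mcc_with - mcc_without) > threshold
```

The coefficient ranges from -1 (total disagreement) through 0 (chance-level) to 1 (perfect classification), and it uses all four counts, which makes it less sensitive than accuracy to class imbalance.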



Introduction

For a given medical imaging protocol, differences in the resulting medical images and the values extracted from them can arise when they are collected in different contexts. Many artificial intelligence/computer-aided diagnosis (AI/CADx) models for diagnosis and prognosis of disease make use of medical images that are acquired within a single institution. Although this can potentially reduce differences in some factors, there is substantial interest in combining datasets to form potentially more generalizable models through the use of images from multiple institutions. The concept of harmonization can have several different levels of application in medical imaging, such as at image acquisition, in which protocols are controlled to be implemented the same way in different contexts, or in the post-processing of acquired images to normalize them between two sources of data.
