Abstract

Although there are several options for improving the generalizability of learned models, a data instance-based approach is desirable when stable data acquisition conditions cannot be guaranteed. Data transformation methods are widely used to reduce discrepancies between data domains, yet a detailed analysis explaining their performance is lacking. This study compares several data transformation methods on a tuberculosis detection task with multi-institutional chest x-ray (CXR) data. Five data transformations were implemented: normalization, standardization with and without lung masking, and multi-frequency-based (MFB) standardization with and without lung masking. A tuberculosis detection network was trained using a reference dataset, and data from six other sites were used to compare network performance. To analyze data harmonization performance, we extracted radiomic features, calculated the Mahalanobis distance, and visualized the features with a dimensionality reduction technique. Deep features of the trained networks were analyzed with similar methods to examine the models' responses to data from the various sites. Across the numerical assessments, MFB standardization with lung masking provided the highest network performance on the non-reference datasets. In the radiomic and deep feature analyses, the features of the multi-site CXRs after MFB standardization with lung masking were well homogenized to the reference data, whereas the other methods showed limited performance. Conventional normalization and standardization were suboptimal in minimizing feature differences among sites. Our study emphasizes the strengths of MFB standardization with lung masking in terms of network performance and feature homogenization.
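
As a rough illustration of the conventional transformations and the distance measure named in the abstract, the following minimal sketch assumes CXR images and lung masks are NumPy arrays and that radiomic features are stored as rows of a 2D array. The function names are hypothetical and are not taken from the study, the use of a pseudo-inverse for the covariance matrix is an assumption made for numerical robustness, and MFB standardization is left out because the abstract does not specify it.

```python
import numpy as np


def min_max_normalize(img):
    """Rescale image intensities linearly to the range [0, 1]."""
    img = np.asarray(img, dtype=np.float64)
    lo, hi = img.min(), img.max()
    if hi == lo:
        # Constant image: avoid division by zero.
        return np.zeros_like(img)
    return (img - lo) / (hi - lo)


def standardize(img, mask=None):
    """Shift and scale an image to zero mean and unit variance.

    If a lung mask is given, the mean and standard deviation are
    computed from the masked pixels only (an assumed reading of
    "standardization with lung masking"), then applied to the whole image.
    """
    img = np.asarray(img, dtype=np.float64)
    region = img if mask is None else img[np.asarray(mask, dtype=bool)]
    mean, std = region.mean(), region.std()
    if std == 0:
        return img - mean
    return (img - mean) / std


def mahalanobis_to_reference(features, reference):
    """Mahalanobis distance of each feature row from the reference set.

    `reference` has shape (n_reference, n_features) and defines the mean
    and covariance; `features` has shape (n_samples, n_features).
    """
    features = np.asarray(features, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    mu = reference.mean(axis=0)
    cov = np.atleast_2d(np.cov(reference, rowvar=False))
    cov_inv = np.linalg.pinv(cov)  # assumption: pseudo-inverse for stability
    diff = features - mu
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return np.sqrt(np.maximum(d2, 0.0))
```

Under these assumptions, a smaller distance from a site's features to the reference set would indicate that the transformation brought that site's data closer to the reference distribution.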
