Deep learning-based harmonization of CT reconstruction kernels towards improved clinical task performance.

Dongyang Du, Wenbing Lv, Jieqin Lv, Xiaohui Chen, Lijun Lu, Hubing Wu, Arman Rahmim

doi:10.1007/s00330-022-09229-w

Abstract

This study aimed to develop a deep learning-based harmonization framework and to assess whether it can improve the performance of radiomics models across different reconstruction kernels in different clinical tasks, and whether it generalizes to mitigate the effects of new or unobserved kernels on radiomics features.

Patient data with 2 reconstruction kernels and phantom data with 22 reconstruction kernels were included. Eighty-five patients were studied for lymph node metastasis (LNM) prediction, and 164 patients for differential diagnosis between lung cancer (LC) and pulmonary tuberculosis (TB). Two convolutional neural network (CNN) models were developed to convert images (i) from B70f to B30f (CNNa) and (ii) from B30f to B70f (CNNb). Model performance between the two kernels was evaluated using the area under the receiver operating characteristic curve (AUC) and compared with other well-known harmonization methods. Patient-normalized feature difference (PNFD) was used to identify kernels incompatible with the baseline (B30f/B70f), i.e., kernels with median PNFD > 1, and to measure the ability of the CNN models to convert these non-comparable kernels.

For LC versus pulmonary TB diagnosis, AUCs of CNNa vs. other methods were 0.85 vs. 0.54-0.74 (p = 0.0001-0.0003), and AUCs of CNNb vs. other methods were 0.87 vs. 0.54-0.86 (p = 0.0001-0.55). For LNM prediction, AUCs of CNNa vs. other methods were 0.68 vs. 0.56-0.61 (p = 0.10-0.39), and AUCs of CNNb vs. other methods were 0.78 vs. 0.70-0.73 (p = 0.07-0.40). After CNN harmonization, 17 of the 20 investigated unknown kernels (85%) produced radiomics feature values comparable to baseline (median PNFD decreased from 1.10-2.31 to 0.23-1.13).

CNN harmonization effectively improved the performance of radiomics models between reconstruction kernels in different clinical tasks, and reduced feature differences between unknown kernels and baseline.

• The soft (B30f) and sharp (B70f) kernels strongly affect radiomics reproducibility and generalizability.
• The convolutional neural network (CNN) harmonization methods performed better than location-scale harmonization methods (ComBat and centering-scaling) and matrix factorization harmonization methods (based on singular value decomposition (SVD) and independent component analysis (ICA)) in both clinical tasks.
• The CNN harmonization methods improve feature reproducibility not only between specific kernels (B30f and B70f) from the same scanner, but also between unobserved kernels from different scanners of different vendors.
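
The median PNFD > 1 criterion used above to flag incompatible kernels can be illustrated with a minimal sketch. The abstract does not give the PNFD formula, so the definition below is an assumption: the absolute difference of a feature between a kernel and the baseline, divided by the inter-patient standard deviation of that feature. All function names, feature names, and numbers are hypothetical toy values, not data from the study.

```python
from statistics import median, stdev


def pnfd(value_kernel, value_baseline, patient_values):
    """ASSUMED definition (not stated in the abstract): absolute feature
    difference between a kernel and the baseline, normalized by the
    inter-patient standard deviation of the same feature."""
    spread = stdev(patient_values)
    if spread == 0:
        raise ValueError("feature has zero inter-patient spread")
    return abs(value_kernel - value_baseline) / spread


def median_pnfd(kernel_features, baseline_features, patient_features):
    """Median PNFD over every feature listed in the baseline dict."""
    return median(
        pnfd(kernel_features[name], baseline_features[name], patient_features[name])
        for name in baseline_features
    )


def is_incompatible(kernel_features, baseline_features, patient_features, threshold=1.0):
    """A kernel is flagged as incompatible when its median PNFD exceeds 1."""
    return median_pnfd(kernel_features, baseline_features, patient_features) > threshold


# Hypothetical toy values for three features.
baseline = {"mean": 10.0, "entropy": 4.0, "contrast": 2.0}
unknown_kernel = {"mean": 10.5, "entropy": 6.0, "contrast": 5.0}
harmonized = {"mean": 10.2, "entropy": 4.5, "contrast": 2.4}
patients = {
    "mean": [8.0, 10.0, 12.0],
    "entropy": [3.0, 4.0, 5.0],
    "contrast": [1.0, 2.0, 3.0],
}

print(is_incompatible(unknown_kernel, baseline, patients))
print(is_incompatible(harmonized, baseline, patients))
```

In this toy case the unharmonized kernel is flagged and the harmonized one is not, mirroring the kind of before/after comparison the abstract reports. Under the assumed definition, a PNFD of 1 means the kernel shifts a feature by as much as patients typically differ from one another.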

Full Text