Background
Large vision models (LVMs) pretrained on large datasets have demonstrated an enormous capacity to understand visual patterns and capture semantic information from images. We propose a novel method of knowledge domain adaptation with pretrained LVMs to build a low-cost artificial intelligence (AI) model that quantifies the severity of SARS-CoV-2 pneumonia from frontal chest X-ray (CXR) images.

Methods
Our method used a pretrained LVM as the primary feature extractor and self-supervised contrastive learning for domain adaptation. An encoder with a 2048-dimensional feature vector output was first trained by self-supervised learning for knowledge domain adaptation. A multi-layer perceptron (MLP) was then trained for the final severity prediction. A dataset of 2599 CXR images was used for model training and evaluation.

Results
The model based on the pretrained vision transformer (ViT) and self-supervised learning achieved the best performance in cross-validation, with a mean squared error (MSE) of 23.83 (95% CI 22.67–25.00) and a mean absolute error (MAE) of 3.64 (95% CI 3.54–3.73). Its predictions correlated with the reference scores with an R² of 0.81 (95% CI 0.79–0.82) and a Spearman ρ of 0.80 (95% CI 0.77–0.81), comparable to current state-of-the-art (SOTA) methods trained on much larger CXR datasets.

Conclusion
The proposed method achieves SOTA performance in quantifying the severity of SARS-CoV-2 pneumonia at a significantly lower cost. It can be extended to the detection or quantification of other infectious diseases, expediting the application of AI in medical research.
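To make the two-stage pipeline concrete, below is a minimal PyTorch sketch of the idea the abstract describes: contrastively adapt a pretrained backbone to the CXR domain, then train an MLP regressor on the frozen 2048-dimensional features. Everything specific here is an assumption for illustration, not a detail from the paper: the ResNet-50 stand-in encoder (chosen because it natively emits 2048-d features; the paper's best model used a ViT), the SimCLR-style NT-Xent loss, the projection size, temperature, MLP widths, and the 0–24 severity scale.

```python
# Minimal sketch of the two-stage pipeline: (1) contrastively adapt a
# pretrained encoder to the CXR domain, (2) train an MLP regressor on
# the frozen 2048-d features. All hyperparameters are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet50

encoder = resnet50(weights="IMAGENET1K_V2")  # stand-in pretrained backbone
encoder.fc = nn.Identity()                   # expose the 2048-d pooled features


class ProjectionHead(nn.Module):
    """SimCLR-style head, used only during contrastive adaptation."""

    def __init__(self, in_dim: int = 2048, proj_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, in_dim), nn.ReLU(inplace=True),
            nn.Linear(in_dim, proj_dim),
        )

    def forward(self, x):
        return F.normalize(self.net(x), dim=-1)  # unit-norm embeddings


def nt_xent_loss(z1, z2, temperature: float = 0.1):
    """NT-Xent contrastive loss over two augmented views of one batch."""
    n = z1.size(0)
    z = torch.cat([z1, z2], dim=0)                 # (2n, d)
    sim = z @ z.t() / temperature                  # cosine similarities
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float("-inf"))     # exclude self-pairs
    # Row i's positive is its other view: i+n for i<n, i-n otherwise.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)


class SeverityMLP(nn.Module):
    """MLP head mapping 2048-d features to a scalar severity score."""

    def __init__(self, in_dim: int = 2048, hidden: int = 512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(inplace=True),
            nn.Linear(hidden, 1),
        )

    def forward(self, feats):
        return self.net(feats).squeeze(-1)


# Stage 1: one contrastive step on a dummy batch of two augmented views.
head = ProjectionHead()
view1, view2 = torch.randn(4, 3, 224, 224), torch.randn(4, 3, 224, 224)
nt_xent_loss(head(encoder(view1)), head(encoder(view2))).backward()

# Stage 2: freeze the adapted encoder, regress severity with the MLP.
mlp = SeverityMLP()
images = torch.randn(4, 3, 224, 224)
scores = torch.rand(4) * 24.0  # hypothetical 0-24 severity scale
with torch.no_grad():
    feats = encoder(images)
F.mse_loss(mlp(feats), scores).backward()
```

The design choice this sketch reflects is what keeps the approach low-cost: the expensive backbone is adapted with unlabeled images only, so the labeled severity scores are needed solely for the small MLP head.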