Objective: Medical imaging plays a central role in modern medicine by supporting diagnosis and treatment. For small medical image datasets, training a network from scratch is time-consuming, so transfer learning is commonly used instead: the network is usually initialized with ImageNet weights and then fine-tuned. We propose a methodology for classifying COVID-19 severity (mild, moderate, severe, critical) in chest X-ray images, based on transfer learning with the DenseNet121 architecture as the base model.

Methods: The novelty of the approach lies mainly in comparing three weight initialization schemes, (i) ImageNet, (ii) CheXNeXt, and (iii) DeepCOVID-XR, whose source images are progressively more similar in nature to the target data. ImageNet is the least similar; CheXNeXt is more similar, as its weights were trained on different lung diseases not including COVID-19; and DeepCOVID-XR is the most similar, as its weights were obtained by training specifically on COVID-19 X-ray images.

Results: ImageNet initialization gave the worst results (average AUC = 0.700), DeepCOVID-XR gave better results (average AUC = 0.774), and CheXNeXt performed significantly better than both (average AUC = 0.917). Merging the severe and critical classes, to account for the similarity between these classes in the dataset, improved the results of the weaker-performing models (average AUC after merging: ImageNet 0.867, CheXNeXt 0.900, DeepCOVID-XR 0.794).

Conclusion: These results suggest that, in the medical domain, where image datasets are usually small and highly imbalanced, better results are achieved when the initial weights come from images similar in nature to the new dataset. However, there is no need to start from weights obtained by training on the same disease, as this may cause overfitting.
ImageNet and medical imaging datasets differ significantly, and the results of this study could help guide future implementations of transfer learning in medical applications.
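
The "average AUC" figures and the effect of merging the severe and critical classes can be made concrete with a small sketch. It assumes that the average is an unweighted (macro) mean of one-vs-rest AUCs over the severity classes, which the abstract does not state. It also merges classes by relabeling and summing predicted probabilities, a simplification for illustration only; the study may instead have retrained the classifier on the merged labels. All function names and the toy predictions are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

CLASSES = ["mild", "moderate", "severe", "critical"]


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(y_true: Sequence[str],
                  probs: Sequence[Dict[str, float]],
                  classes: Sequence[str]) -> float:
    """Unweighted mean of the one-vs-rest AUCs (assumed averaging scheme)."""
    aucs = []
    for c in classes:
        labels = [1 if y == c else 0 for y in y_true]
        scores = [p[c] for p in probs]
        aucs.append(binary_auc(labels, scores))
    return sum(aucs) / len(aucs)


def merge_classes(y_true: Sequence[str],
                  probs: Sequence[Dict[str, float]],
                  sources: Tuple[str, ...] = ("severe", "critical"),
                  target: str = "severe_or_critical",
                  ) -> Tuple[List[str], List[Dict[str, float]]]:
    """Relabel the source classes as one class and sum their probabilities
    (illustrative post-hoc merge, not necessarily the study's procedure)."""
    new_y = [target if y in sources else y for y in y_true]
    new_probs = []
    for p in probs:
        merged = {c: v for c, v in p.items() if c not in sources}
        merged[target] = sum(p[c] for c in sources)
        new_probs.append(merged)
    return new_y, new_probs


# Toy predictions (made up): the model confuses severe and critical cases.
y_true = ["mild", "moderate", "severe", "critical"]
probs = [
    {"mild": 0.70, "moderate": 0.20, "severe": 0.05, "critical": 0.05},
    {"mild": 0.20, "moderate": 0.60, "severe": 0.10, "critical": 0.10},
    {"mild": 0.05, "moderate": 0.10, "severe": 0.35, "critical": 0.50},
    {"mild": 0.05, "moderate": 0.15, "severe": 0.45, "critical": 0.35},
]

auc_four = macro_ovr_auc(y_true, probs, CLASSES)
y3, p3 = merge_classes(y_true, probs)
auc_three = macro_ovr_auc(y3, p3, ["mild", "moderate", "severe_or_critical"])
print(round(auc_four, 3), round(auc_three, 3))  # 0.833 1.0
```

In this toy case the four-class average is pulled down only by the severe and critical classes, so merging them raises it. This mirrors the pattern reported for the weaker-performing initializations, although the numbers here are invented.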