Leukemia is a prevalent blood disease, and its early diagnosis is crucial for effective treatment. Diagnosing leukemia types relies heavily on pathologists' morphological examination of blood cell images. However, this process is tedious and time-consuming, and the results are subjective, leading to potential misdiagnosis and missed diagnoses. To address these challenges, this paper proposes a blood cell image classification method that combines a masked autoencoder (MAE) with an enhanced Vision Transformer. First, the model is pre-trained on two datasets, TMAMD and Red4, using the MAE self-supervised learning algorithm. The pre-trained weights are then transferred to our improved model. The method fuses the outputs of each layer of the Transformer encoder to fully exploit the features extracted by lower layers, such as the color, contour, and texture of blood cells, alongside deeper semantic features. Furthermore, a sub-center ArcFace loss with dynamic margins is employed to strengthen the model's fine-grained feature representation by promoting inter-class dispersion and intra-class compactness. Models trained with our method achieve state-of-the-art results on both the TMAMD and Red4 datasets, with classification accuracies of 93.51% and 81.41%, respectively. These results are expected to serve as a valuable reference for physicians in clinical diagnosis.
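The sub-center ArcFace loss with dynamic margins mentioned above can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the margin schedule (larger margins for rarer classes), the function names, and all hyperparameters (`m_min`, `m_max`, scale `s`, `K` sub-centers) are assumptions for illustration only.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Normalize vectors to unit length along the given axis."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def dynamic_margins(class_counts, m_min=0.05, m_max=0.5):
    """Hypothetical per-class margin schedule: rarer classes get
    larger margins, mapped linearly into [m_min, m_max]."""
    inv = np.asarray(class_counts, dtype=float) ** -0.25
    inv = (inv - inv.min()) / (inv.max() - inv.min() + 1e-12)
    return m_min + (m_max - m_min) * inv

def subcenter_arcface_logits(emb, W, labels, margins, s=30.0):
    """Sub-center ArcFace logits.
    emb: (B, D) embeddings; W: (C, K, D) K sub-center weights per class;
    labels: (B,) target class indices; margins: (C,) per-class angular margins.
    The class cosine is the max over the K sub-centers; the angular
    margin is added only to the target-class angle, then scaled by s."""
    emb = l2_normalize(emb)
    Wn = l2_normalize(W)
    # cosine similarity to every sub-center, then max over sub-centers
    cos = np.einsum('bd,ckd->bck', emb, Wn).max(axis=2)      # (B, C)
    theta = np.arccos(np.clip(cos, -1 + 1e-7, 1 - 1e-7))
    onehot = np.zeros_like(cos)
    onehot[np.arange(len(labels)), labels] = 1.0
    return s * np.cos(theta + onehot * margins[labels][:, None])
```

Using several sub-centers per class lets noisy or heterogeneous samples attach to a secondary center instead of distorting the dominant one, while the per-class margins push harder on under-represented classes; both effects support the inter-class dispersion and intra-class compactness described in the abstract.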