In clinical drug sensitivity experiments, it is necessary to plate culture pathogenic bacteria and pick suitable colonies for bacterial solution preparation, which is a process that is currently carried out completely by hand. Moreover, the problems of plate contamination, a long culture period, and large image annotation in colony plate image acquisition can lead to a small amount of usable data. To address the issues mentioned above, we adopt a deep learning approach and conduct experiments on the AGAR dataset. We propose to use style transfer to extend the trainable dataset and successfully obtain 4k microbial colony images using this method. In addition, we introduce the Swin Transformer as a feature extraction network in the Cascade Mask R-CNN model architecture to better extract the feature information of the images. After our experimental comparison, the model achieves a mean Average Precision (mAP) of 61.4% at the Intersection over Union (IoU) [0.50:0.95]. This performance surpasses that of the Cascade R-CNN with HRNet, which is the top-performing model in experiments conducted on the AGAR dataset, by a margin of 2.2%. Furthermore, we perform experiments using YOLOv8x on the AGAR dataset, which results in a mAP of 76.7%.
Read full abstract