Abstract
In many medical and pharmaceutical processes, continuous hygiene monitoring relies on skilled personnel manually detecting microorganisms on agar dishes. Deep learning could automate this task, but it is often limited by insufficient training data, a common problem in colony detection. To address this, we propose a simple yet efficient SAM-based pipeline for Copy-Paste data augmentation that improves detection performance even with limited data. In this method, annotated microbial colonies from real images were copied and pasted into images of empty agar dishes to create new synthetic samples. These samples inherited the annotations of the inserted colonies, so no further labeling was required. The resulting synthetic datasets were used to train a YOLOv8 detection model, which was then fine-tuned on just 10 to 1000 real images. The best fine-tuned model, which used only 1000 real images, achieved an mAP of 60.6, whereas a base model trained on 5241 real images achieved 64.9. Despite using far fewer real images, the fine-tuned model performed comparably, demonstrating the effectiveness of SAM-based Copy-Paste augmentation. This approach matches or even exceeds the current state of the art in synthetic data generation for colony detection and can be extended to more microbial species and agar dishes.
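The core Copy-Paste step, in which a pasted colony carries its annotation into the synthetic image, can be illustrated with a minimal NumPy sketch. This is not the pipeline's actual implementation: the function name `paste_colony`, its parameters, and the YOLO-style normalized label format are assumptions made for illustration, the binary mask is assumed to come from a segmentation model such as SAM, and the patch is assumed to fit entirely inside the dish image.

```python
import numpy as np


def paste_colony(dish, patch, mask, top, left, class_id):
    """Paste a masked colony patch into an empty-dish image (hypothetical helper).

    dish:  H x W x 3 image of an empty agar dish
    patch: h x w x 3 crop containing one colony
    mask:  h x w boolean array, True on colony pixels (assumed to come from SAM)
    Returns the composited image and a YOLO-style label
    (class_id, x_center, y_center, width, height), normalized to [0, 1].
    """
    out = dish.copy()
    h, w = mask.shape
    region = out[top:top + h, left:left + w]  # view into the output image
    region[mask] = patch[mask]                # copy only the colony pixels

    # Tight bounding box of the mask, shifted into dish coordinates.
    ys, xs = np.nonzero(mask)
    y0, y1 = top + ys.min(), top + ys.max() + 1
    x0, x1 = left + xs.min(), left + xs.max() + 1

    img_h, img_w = dish.shape[:2]
    label = (
        class_id,
        (x0 + x1) / 2 / img_w,
        (y0 + y1) / 2 / img_h,
        (x1 - x0) / img_w,
        (y1 - y0) / img_h,
    )
    return out, label


# Usage with toy data: a black "dish" and a white rectangular "colony".
dish = np.zeros((100, 200, 3), dtype=np.uint8)
patch = np.full((10, 20, 3), 255, dtype=np.uint8)
mask = np.zeros((10, 20), dtype=bool)
mask[2:8, 5:15] = True
synthetic, label = paste_colony(dish, patch, mask, top=40, left=90, class_id=0)
```

Because the label is derived from the mask at paste time, each synthetic image is annotated without manual work. A realistic pipeline would likely also need to handle overlap between pasted colonies, dish boundaries, and blending at colony edges, which this sketch omits.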