The aim of this research is to create an automated system for identifying soil microorganisms at the genus level from raw microscopic images of monocultural colonies grown in a laboratory environment. The examined genera are Fusarium, Trichoderma, Verticillium, Purpureocillium and Phytophthora. The proposed pipeline works on unprocessed microscopic images and requires no additional sample marking or staining. The methodology comprises three stages: image preprocessing, segmentation to isolate the microorganisms from the background, and calculation of color and texture features for classification. An Extreme Learning Machine model was trained and validated on an extensive dataset of 2866 images from the National Institute of Horticultural Research in Skierniewice. The model achieves high accuracy and computational efficiency compared with other state-of-the-art machine learning methods such as CatBoost, Random Forest and Convolutional Neural Networks. Statistical techniques, including Multivariate Analysis of Variance, were employed to confirm significant differences among the datasets, enhancing the model’s robustness. In addition, Shapley Additive Explanations values provided transparency into the model’s decision-making process. This approach has the potential to improve early detection and management of soil pathogens, promoting sustainable agriculture and demonstrating the potential of machine learning in environmental monitoring, microbial ecology and industrial microbiology.
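For readers unfamiliar with the classifier named above, the following is a minimal sketch of a generic Extreme Learning Machine, not the authors' implementation. It assumes a tanh hidden layer with fixed random weights and a ridge-regularized least-squares readout; the class name `ExtremeLearningMachine`, the parameters `n_hidden` and `ridge`, and the synthetic five-class data standing in for the color and texture features are all hypothetical.

```python
import numpy as np


class ExtremeLearningMachine:
    """Generic ELM sketch: random hidden layer plus a linear readout."""

    def __init__(self, n_hidden=64, ridge=1e-3, seed=0):
        self.n_hidden = n_hidden  # number of random hidden units (assumed)
        self.ridge = ridge        # ridge penalty for the readout (assumed)
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Hidden activations; W and b are drawn once and never trained.
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        n_features = X.shape[1]
        self.classes_ = np.unique(y)
        # Scale the random weights so the tanh units do not all saturate.
        self.W = self.rng.normal(size=(n_features, self.n_hidden)) / np.sqrt(n_features)
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        # One-hot targets, one column per class.
        T = (y[:, None] == self.classes_[None, :]).astype(float)
        # Closed-form ridge solution: (H^T H + ridge * I) beta = H^T T.
        A = H.T @ H + self.ridge * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ T)
        return self

    def predict(self, X):
        scores = self._hidden(X) @ self.beta
        return self.classes_[np.argmax(scores, axis=1)]


def make_synthetic(centers, n_per_class, rng):
    """Gaussian clusters used as a stand-in for real image features."""
    X = np.vstack([c + 0.5 * rng.normal(size=(n_per_class, c.size)) for c in centers])
    y = np.repeat(np.arange(len(centers)), n_per_class)
    return X, y


rng = np.random.default_rng(42)
centers = 3.0 * rng.normal(size=(5, 8))  # 5 hypothetical classes, 8 features
X_train, y_train = make_synthetic(centers, 40, rng)
X_test, y_test = make_synthetic(centers, 20, rng)

# Standardize with training statistics only.
mean, std = X_train.mean(axis=0), X_train.std(axis=0)
X_train = (X_train - mean) / std
X_test = (X_test - mean) / std

model = ExtremeLearningMachine(n_hidden=64).fit(X_train, y_train)
predictions = model.predict(X_test)
accuracy = float(np.mean(predictions == y_test))
print(f"held-out accuracy on synthetic data: {accuracy:.3f}")
```

Because only the readout weights are fitted, and in closed form, training reduces to a single linear solve. This is the usual source of the computational efficiency attributed to this family of models.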