This study aimed to apply near-infrared (NIR) spectroscopy combined with machine learning techniques to identify yeast strains rapidly and practically, and to compare the results with traditional molecular identification methods. Yeasts were isolated from the digestive tracts of aquatic mining insects collected in the extreme north of the Western Amazon (Roraima), Brazil, and preserved by cryopreservation and under mineral oil. Molecular identification involved PCR amplification and sequencing of ribosomal DNA regions. NIR spectroscopy, coupled with multivariate analysis and machine learning methods, namely principal component analysis (PCA), hierarchical cluster analysis (HCA), k-nearest neighbor (KNN), and soft independent modeling by class analogy (SIMCA), was used to analyze and classify the yeast samples. The approach identified yeast strains at the genus and species levels with 100% accuracy in both the calibration and validation sets. The results indicate that this method provides a rapid, non-destructive, and environmentally friendly alternative to traditional molecular techniques, making it suitable for real-time, in situ analysis with minimal sample manipulation.
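As a rough illustration of the KNN classification step with separate calibration and validation sets, the following minimal sketch may help. It is not the authors' pipeline: the spectra are synthetic (one Gaussian band per class plus noise), the class names, band positions, split sizes, and `k=3` are all hypothetical, and no preprocessing, PCA, HCA, or SIMCA step is included.

```python
import math
import random


def make_spectrum(center, rng, n_points=50, noise=0.01):
    """Synthetic 'spectrum': a single Gaussian band plus noise (illustrative only)."""
    return [
        math.exp(-((i - center) ** 2) / (2 * 5.0 ** 2)) + rng.gauss(0, noise)
        for i in range(n_points)
    ]


def knn_predict(train, query, k=3):
    """Majority vote among the k training spectra closest in Euclidean distance."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical classes, each with an assumed band position (not real yeast data).
band_centers = {"strain_A": 15, "strain_B": 25, "strain_C": 35}

calibration, validation = [], []
for label, center in band_centers.items():
    for i in range(10):
        sample = (make_spectrum(center, rng), label)
        # First 7 spectra per class go to calibration, the last 3 to validation.
        (calibration if i < 7 else validation).append(sample)

correct = sum(knn_predict(calibration, x) == y for x, y in validation)
accuracy = correct / len(validation)
print(f"validation accuracy: {accuracy:.2f}")
```

Because the synthetic classes are well separated by construction, this toy example says nothing about how real NIR spectra of yeasts behave; it only shows the mechanics of fitting on a calibration set and scoring on held-out validation spectra.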