Abstract

Surface-enhanced Raman spectroscopy (SERS) has wide diagnostic applications due to narrow spectral features that allow multiplex analysis. We have previously developed a multiplexed, SERS-based nanosensor for micro-RNA (miRNA) detection called the inverse molecular sentinel (iMS). Machine learning (ML) algorithms have been increasingly adopted for spectral analysis due to their ability to discover underlying patterns and relationships within large and complex data sets. However, the high dimensionality of SERS data poses a challenge for traditional ML techniques, which can be prone to overfitting and poor generalization. Non-negative matrix factorization (NMF) reduces the dimensionality of SERS data while preserving information content. In this paper, we compared the performance of ML methods including convolutional neural network (CNN), support vector regression, and extreme gradient boosting combined with and without NMF for spectral unmixing of four-way multiplexed SERS spectra from iMS assays used for miRNA detection. CNN achieved high accuracy in spectral unmixing. Incorporating NMF before CNN drastically decreased memory and training demands without sacrificing model performance on SERS spectral unmixing. Additionally, models were interpreted using gradient class activation maps and partial dependency plots to understand predictions. These models were used to analyze clinical SERS data from single-plexed iMS in RNA extracted from 17 endoscopic tissue biopsies. CNN and CNN-NMF, trained on multiplexed data, performed most accurately with RMSElabel = 0.101 and 9.68 × 10-2, respectively. We demonstrated that CNN-based ML shows great promise in spectral unmixing of multiplexed SERS spectra, and the effect of dimensionality reduction on performance and training speed.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call