Abstract
Raman spectroscopy, combined with machine learning techniques, holds great promise for many applications as a rapid, sensitive, and label-free identification method. Such approaches perform well when classifying spectra of chemical species that were encountered during the training phase, that is, species known to the neural network. However, in real-world settings such as clinical applications, there will always be substances whose spectra have not yet been recorded. When the neural network encounters these new species during the testing phase, the number of false positives becomes uncontrollable, which limits the usefulness of these techniques, especially in public safety applications. To overcome this barrier, we implemented the recently introduced Entropic Open Set and Objectosphere loss functions. To demonstrate the efficacy and efficiency of this approach, we compiled a database of hyperspectral Raman images of 40 chemical species and separated them into three classes. The known class consisted of 20 biologically relevant species comprising amino acids; the ignored class consisted of 10 "irrelevant" species comprising bio-related chemicals; and the never-seen-before class consisted of 10 various chemical species that the neural network had not seen during training. We show that this approach not only enables the network to separate the unknown species effectively while preserving high accuracy on the known ones and reducing false positives, but also performs better than the current gold standards in machine learning techniques. This opens the door to using Raman spectroscopy, combined with our novel machine learning algorithm, in a variety of practical applications. Availability and implementation: freely available on the web at https://github.com/BalytskyiJaroslaw/RamanOpenSet.git.
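For readers unfamiliar with the two loss functions named above, the following is a minimal NumPy sketch of their general form as commonly stated in the open-set literature; it is not taken from the linked repository. The function names, the use of label -1 to mark ignored-class training samples, the batch averaging, and the values of `xi` and `lam` are all illustrative assumptions.

```python
import numpy as np

def log_softmax(logits):
    # Numerically stable log-softmax over the class axis.
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def entropic_open_set_loss(logits, labels):
    # Assumption: labels >= 0 mark known classes, -1 marks ignored-class samples.
    logp = log_softmax(logits)
    known = np.flatnonzero(labels >= 0)
    ignored = np.flatnonzero(labels < 0)
    per_sample = np.empty(len(labels))
    # Known samples: ordinary cross-entropy on the true class.
    per_sample[known] = -logp[known, labels[known]]
    # Ignored samples: push the softmax output toward the uniform distribution.
    per_sample[ignored] = -logp[ignored].mean(axis=1)
    return per_sample.mean()

def objectosphere_loss(logits, features, labels, xi=10.0, lam=0.01):
    # features: deep-feature vectors from the layer before the logits.
    # xi (minimum feature norm for known samples) and lam (penalty weight)
    # are hypothetical values chosen only for illustration.
    norms = np.linalg.norm(features, axis=1)
    known = labels >= 0
    # Known samples are penalized for feature norms below xi;
    # ignored samples are penalized for any nonzero feature norm.
    penalty = np.where(known, np.maximum(xi - norms, 0.0) ** 2, norms ** 2)
    return entropic_open_set_loss(logits, labels) + lam * penalty.mean()
```

Under this formulation, a sample from the ignored class is driven toward a uniform softmax output and a small feature magnitude, so that at test time a threshold on the maximum class probability (or on the feature norm) can be used to reject species the network was never trained on.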