Abstract

Liquid chromatography-mass spectrometry (LC-MS) is widely used to screen for adulteration. Unscrupulous businesses continually introduce novel illegal adulterants, so methods are needed to screen for compounds that are absent from current libraries. Conventional cosine similarity for mass spectral library matching is limited in its ability to identify structurally similar compounds. In our previous study, a comparison of four advanced similarity algorithms showed that Spec2Vec offered the best combined performance in terms of detection capability and false discovery rate, making it the method of choice for identifying illegal adulterants. However, Spec2Vec still performed worse than MS2DeepScore in detection capability and worse than entropy similarity in false discovery rate. In this study, we aimed to optimize spectral similarity performance for a specific compound class by fine-tuning a pretrained Spec2Vec model. We also applied the chemical classification tool CANOPUS to address false positives that arise because illegal adulterants and compounds found in herbal medicine can share similar backbone structures. As a proof of concept, we used glucocorticoids as potentially illicit adulterants. The fine-tuned Spec2Vec model showed a significant improvement in detection ability over the original model and achieved performance comparable to MS2DeepScore, while producing notably fewer false positives than MS2DeepScore. Overall, the proposed pipeline is effective and competitive for inspecting illegal adulterants and enhances the analysis of large-scale MS data.
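To illustrate why conventional cosine similarity struggles with structurally similar compounds, the following is a minimal sketch of a binned cosine score. It is a simplified illustration, not the scoring implementation used in this study: the function name `binned_cosine`, the `bin_width` parameter, and all peak values are hypothetical and chosen only for the example.

```python
import math


def binned_cosine(spec_a, spec_b, bin_width=1.0):
    """Cosine similarity between two spectra given as (m/z, intensity) pairs.

    Peaks are assigned to fixed-width m/z bins (an assumption made for
    simplicity); only peaks that fall in the same bin contribute to the score.
    """
    def to_vector(spec):
        vec = {}
        for mz, intensity in spec:
            key = round(mz / bin_width)
            vec[key] = vec.get(key, 0.0) + intensity
        return vec

    va, vb = to_vector(spec_a), to_vector(spec_b)
    dot = sum(va[k] * vb[k] for k in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(v * v for v in va.values()))
    norm_b = math.sqrt(sum(v * v for v in vb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Invented peaks: a library compound and a hypothetical analog in which
# two of the three fragments are shifted by the same mass difference (+18).
reference = [(100.0, 1.0), (150.0, 0.5), (200.0, 0.8)]
analog = [(100.0, 1.0), (168.0, 0.5), (218.0, 0.8)]

score_identical = binned_cosine(reference, reference)
score_analog = binned_cosine(reference, analog)
print(score_identical, score_analog)
```

In this toy case the analog shares only one fragment bin with the reference, so its score drops to roughly 0.53 even though its fragmentation pattern is closely related. This is the kind of gap that embedding-based scores such as Spec2Vec are intended to narrow.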
