Abstract

Although untargeted metabolomics captures a vast amount of metabolic information, many biological applications call for a biology-driven metabolomics platform that targets the set of metabolites relevant to a given biological question. Steroids are an important class of molecules that play critical roles in many physiological systems and diseases. Beyond the known steroids, a large number of steroids have not yet been reported in the literature. The ability to rapidly detect and quantify both known and unknown steroid molecules in a biological sample can greatly accelerate a broad range of steroid-focused life science research. This work describes the development and application of SteroidXtract, a convolutional neural network (CNN)-based bioinformatics tool that recognizes steroid molecules in mass spectrometry (MS)-based untargeted metabolomics data from their unique tandem MS (MS2) spectral patterns. SteroidXtract was trained on a comprehensive set of standard MS2 spectra from MassBank of North America (MoNA) and an in-house steroid library. Data augmentation strategies, including intensity thresholding and Gaussian noise addition, were developed and applied to minimize the overfitting caused by the limited number of standard steroid MS2 spectra. The CNN model embedded in SteroidXtract was then compared with random forest and XGBoost models using nested cross-validation to evaluate its performance. Finally, SteroidXtract was applied in several metabolomics studies to demonstrate its sensitivity, specificity, and robustness. Compared to conventional statistics-driven interpretation of metabolomics data, our work offers a novel, automated, biology-driven approach to interpreting untargeted metabolomics data that prioritizes biologically important molecules with high throughput and sensitivity.
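The two augmentation strategies named in the abstract, intensity thresholding and Gaussian noise addition, can be illustrated with a minimal sketch. This is not the SteroidXtract implementation: the binned-intensity representation of an MS2 spectrum, the function names, and the parameters `rel_threshold` and `rel_sigma` are all assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical sketch: an MS2 spectrum is assumed to be a 1-D array of
# binned fragment intensities. Names and default values are illustrative.

def threshold_intensity(intensities, rel_threshold=0.01):
    """Zero out peaks weaker than rel_threshold times the base peak."""
    intensities = np.asarray(intensities, dtype=float)
    base_peak = intensities.max() if intensities.size else 0.0
    out = intensities.copy()
    out[out < rel_threshold * base_peak] = 0.0
    return out

def add_gaussian_noise(intensities, rel_sigma=0.05, rng=None):
    """Perturb each peak with zero-mean Gaussian noise scaled to its intensity.

    Because the noise is proportional to the peak intensity, empty bins stay
    empty. Negative results are clipped to zero.
    """
    rng = np.random.default_rng() if rng is None else rng
    intensities = np.asarray(intensities, dtype=float)
    noise = rng.normal(0.0, 1.0, size=intensities.shape) * rel_sigma * intensities
    return np.clip(intensities + noise, 0.0, None)

def augment(intensities, n_copies=3, rel_threshold=0.01, rel_sigma=0.05, seed=0):
    """Return n_copies noisy variants of the thresholded spectrum."""
    rng = np.random.default_rng(seed)
    cleaned = threshold_intensity(intensities, rel_threshold)
    return [add_gaussian_noise(cleaned, rel_sigma, rng) for _ in range(n_copies)]

# Usage: one standard spectrum becomes several training examples.
spectrum = np.array([100.0, 0.5, 20.0, 0.0, 3.0])
variants = augment(spectrum, n_copies=3)
```

Each variant keeps the peak positions of the original spectrum while varying the intensities, which is one plausible way to enlarge a small set of standard spectra before training a classifier.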
