Abstract

Peak detection in untargeted liquid chromatography-high resolution mass spectrometry (LC-HRMS) data is a key step in identifying the metabolic status of druggable chemicals and of extracts from functional foods or herbs. However, existing approaches struggle to deliver results with both few false positives and few false negatives. In this paper, we propose an automatic method, named Peak_CF, that uses a convolutional neural network (CNN) for image classification and Faster R-CNN for peak location and classification in untargeted LC-HRMS data. On an evaluation data set, it detects target peaks with high accuracy and high recall (both >90%). In detecting the m/z peaks of known compounds, Peak_CF outperforms Peakonly, and it can effectively judge the overall peak shape of split peaks. On the same evaluation data, the recall of MZmine2 (ADAP) is slightly higher than that of Peak_CF, but the F1 score of Peak_CF is higher, indicating that it has higher precision. In addition, a Peak_CF training model with strong generalization ability can be obtained, and this was verified. Finally, Peak_CF was applied to real metabolic fingerprints of total flavonoids from Glycyrrhiza uralensis Fisch., and a comparison was conducted based on 40 m/z peaks of 40 prototype compounds in a serum data set. The results showed that the recall of both Peak_CF and Peakonly reached 95%, higher than the 70% of MZmine2 (ADAP), and that Peak_CF is more accurate when detecting extracted ion chromatograms (EICs) with serious drift. In conclusion, Peak_CF provides a new route for data mining of LC-HRMS data sets of drug, herb, or functional food metabolites.
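The comparison of recall and F1 score above can be made concrete with a small sketch. F1 is the harmonic mean of precision and recall, so a tool with slightly lower recall can only reach a higher F1 if its precision is higher. The function name and the peak counts below are hypothetical, chosen only for illustration; they are not the figures reported for Peak_CF or MZmine2 (ADAP).

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from true-positive, false-positive
    and false-negative peak counts. Empty denominators yield 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts, NOT taken from the paper.
# Tool A: misses a few more true peaks but reports few false peaks.
tool_a = precision_recall_f1(tp=90, fp=10, fn=10)
# Tool B: finds slightly more true peaks but reports many false peaks.
tool_b = precision_recall_f1(tp=95, fp=60, fn=5)

for name, (p, r, f1) in (("A", tool_a), ("B", tool_b)):
    print(f"tool {name}: precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

In this made-up case tool B has the higher recall, yet tool A has the higher F1 because far fewer of its reported peaks are false positives.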
