Abstract
Identifying biomarkers that are indicative of a phenotypic state is difficult because of the amount of natural variability which exists in any population. While there are many different algorithms to select biomarkers, previous investigation shows the sensitivity and flexibility of support vector machines (SVM) make them an attractive candidate. Here we evaluate the ability of support vector machine recursive feature elimination (SVM-RFE) to identify potential metabolic biomarkers in liquid chromatography mass spectrometry untargeted metabolite datasets. Two separate experiments are considered, a low variance (low biological noise) prokaryotic stress experiment, and a high variance (high biological noise) mammalian stress experiment. For each experiment, the phenotypic response to stress is metabolically characterized. SVM-based classification and metabolite ranking is undertaken using a systematically reduced number of biological replicates to evaluate the impact of sample size on biomarker reproducibility and robustness. Our results indicate the highest ranked 1 % of metabolites, the most predictive of the physiological state, were identified by SVM-RFE even when the number of training examples was small (≥3) and the coefficient of variation was high (>0.5). An accuracy analysis shows filtering with recursive feature elimination measurably improves SVM classification accuracy, an effect that is pronounced when the number of training examples is small. These results indicate that SVM-RFE can be successful at biomarker identification even in challenging scenarios where the training examples are noisy and the number of biological replicates is low.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.