Abstract

Support vector machine (SVM) is a popular classification method for the analysis of wide range of data including big data. Many SVM methods with feature selection have been developed under frequentist regularization or Bayesian shrinkage frameworks. On the other hand, the importance of incorporating a priori known biological knowledge, such as gene pathway information which stems from the gene regulatory network, into the statistical analysis of genomic data has been recognized in recent years. In this article, we propose a new Bayesian SVM approach that enables the feature selection to be guided by the knowledge on the graphical structure among predictors. The proposed method uses the spike-and-slab prior for feature selection, combined with the Ising prior that encourages group-wise selection of the predictors adjacent to each other on the known graph. Gibbs sampling algorithm is used for Bayesian inference. The performance of our method is evaluated and compared with existing SVM methods in terms of prediction and feature selection in extensive simulation settings. In addition, our method is illustrated in the analysis of genomic data from a cancer study, demonstrating its advantage in generating biologically meaningful results and identifying potentially important features.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call