Abstract

A major controversy in psychiatric genetics is whether nonadditive genetic interaction effects contribute to the risk of highly polygenic disorders. We applied support vector machines (SVMs), which can build both linear and nonlinear models using kernel methods, to distinguish cases from controls in a large schizophrenia case–control sample of 11,853 subjects (5,554 cases and 6,299 controls), and compared their prediction accuracy with that of the polygenic risk score (PRS) approach. We also investigated whether SVMs are a suitable approach for detecting nonlinear genetic effects, that is, interactions. We found that PRS provided more accurate case/control classification than either linear or nonlinear SVMs, and we give a tentative explanation of why PRS outperforms both multivariate regression and linear kernel SVMs. In addition, we observed that nonlinear kernel SVMs showed higher classification accuracy than linear SVMs when a large number of SNPs were entered into the model. We conclude that SVMs are a potential tool for assessing the presence of interactions, prior to searching for them explicitly.

Highlights

  • Schizophrenia has a complex, polygenic architecture in which a large number of genetic variants spanning a wide spectrum of population frequencies contribute to disease risk (International Schizophrenia Consortium et al, 2009; Lee et al, 2012)

  • The aim of this study was to examine the application of linear and nonlinear support vector machines (SVMs) to identifying the presence of genetic interactions that may contribute to the risk of schizophrenia, and to compare their prediction accuracy with that of the polygenic risk score approach

  • For the genome-wide significant single nucleotide polymorphisms (SNPs), radial basis function (RBF) kernel SVM models did not improve prediction accuracy over the linear SVM model. This implies that it is unlikely that there are SNP × SNP interactions among this set of SNPs with sufficiently large interaction effect sizes to be detected with the current sample size


Summary

INTRODUCTION

Schizophrenia has a complex, polygenic architecture in which a large number of genetic variants spanning a wide spectrum of population frequencies contribute to disease risk (International Schizophrenia Consortium et al, 2009; Lee et al, 2012). The standard approach to genome-wide association study (GWAS) data assumes an additive model, which, in statistical terms, is equivalent to looking for the main effects of variants contributing to disease risk. We employ SVM algorithms, which offer both linear and nonlinear modeling options and may account for pairwise and higher order SNP interactions, to classify schizophrenia cases and controls in a large case–control sample of 11,853 subjects (5,554 cases and 6,299 controls). To compare the results of the SVM analyses directly with those derived from a standard additive model, we first analyzed the 125 genome-wide significant (GWS) autosomal SNPs from PGC2 (Schizophrenia Working Group of the Psychiatric Genomics Consortium, 2014) in our sample. We then explored a set of the top ~5,000 independent SNPs most associated with schizophrenia in the PGC2 study (Schizophrenia Working Group of the Psychiatric Genomics Consortium, 2014), and tested those for the presence of potential interactions.
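
The contrast between an additive model and a model that can capture interactions can be illustrated with a toy sketch. This is not the authors' pipeline: the two SNPs, the carrier coding, the weight grid and the thresholds are all hypothetical, and the explicit product term only stands in for the kind of nonlinearity that a nonlinear kernel may capture. The function `polygenic_risk_score` follows the usual additive definition, a weighted sum of effect-allele counts.

```python
from itertools import product


def polygenic_risk_score(allele_counts, weights):
    """Additive score: effect-allele counts weighted by per-SNP effects."""
    if len(allele_counts) != len(weights):
        raise ValueError("need exactly one weight per SNP")
    return sum(count * weight for count, weight in zip(allele_counts, weights))


def accuracy(predicted, actual):
    """Fraction of individuals whose predicted status matches the true status."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical toy data: carrier status (0/1) at two SNPs. Disease status
# follows a pure interaction (XOR) pattern with no main effect of either SNP.
carriers = [(0, 0), (0, 1), (1, 0), (1, 1)]
status = [0, 1, 1, 0]

# Best purely additive classifier over a small, arbitrary grid of weights
# and decision thresholds.
grid = [-1.0, 0.0, 1.0]
thresholds = [-1.5, -0.5, 0.5, 1.5]
best_additive = max(
    accuracy(
        [int(polygenic_risk_score(x, (w1, w2)) > t) for x in carriers],
        status,
    )
    for w1, w2, t in product(grid, grid, thresholds)
)

# Adding an explicit product term (a nonlinear feature) separates the classes.
with_interaction = accuracy(
    [int(x1 + x2 - 2 * x1 * x2 > 0.5) for x1, x2 in carriers],
    status,
)

print(best_additive, with_interaction)
```

On this toy data the additive search tops out at 0.75, because an XOR pattern is not linearly separable, while the model with a product term reaches 1.0. The same logic motivates comparing linear and nonlinear kernel SVMs: a gain in accuracy from the nonlinear model would hint at interactions, whereas no gain suggests that any interactions are too weak to detect in the sample at hand.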

