Abstract

Genome-wide association studies have identified risk loci for systemic lupus erythematosus (SLE), but a large proportion of the genetic contribution to SLE remains unexplained. To detect novel risk genes and to predict an individual’s SLE risk, we designed a random forest classifier using SNP genotype data generated on the “Immunochip” from 1,160 patients with SLE and 2,711 controls. Using gene importance scores defined by the random forest classifier, we identified 15 potential novel risk genes for SLE. Of these, 12 are associated with autoimmune diseases other than SLE, whereas three genes (ZNF804A, CDK1, and MANF) have not previously been associated with autoimmunity. Random forest classification also allowed prediction of patients at risk for lupus nephritis with an area under the curve of 0.94. By allele-specific gene expression analysis, we detected cis-regulatory SNPs that affect the expression levels of six of the top 40 genes designated by the random forest analysis, indicating a regulatory role for the identified risk variants. The top 40 genes from the prediction were overrepresented for differential expression in B and T cells according to RNA sequencing of samples from five healthy donors, with over-expression more frequent in B cells than in T cells.
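The area under the curve reported for lupus nephritis prediction can be read as the probability that a randomly chosen case receives a higher predicted score than a randomly chosen control. The following is a minimal sketch of that calculation, assuming the scores are per-sample probabilities from a classifier; the function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney pairwise comparison.

    labels: 1 for cases, 0 for controls.
    scores: predicted probabilities, one per sample.
    A tie between a case and a control counts as half a correct ordering.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical toy data for illustration only (not from the study).
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.1]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

In this toy example 11 of the 12 case-control pairs are ordered correctly, so the value is 11/12. The pairwise loop is quadratic in sample size; a rank-based formulation would be preferable for cohorts of the size described here.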

Highlights

  • Genome-wide association studies have identified risk loci for systemic lupus erythematosus (SLE), but a large proportion of the genetic contribution to SLE remains unexplained

  • We used machine learning based on random forests to design a single nucleotide polymorphism (SNP) genotype classifier that distinguishes patients with SLE from healthy individuals

  • For each individual, the random forest classifier yields a probability that the sample originates from a patient with SLE
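One common way an ensemble of trees produces a per-individual probability is to report the fraction of trees that vote "case". The sketch below illustrates this with a deliberately stripped-down stand-in for a random forest: bagged one-split trees (decision stumps), each fitted on a bootstrap sample and a random subset of SNPs. It is not the authors' implementation. The genotype coding (0, 1, 2 copies of the minor allele), all function names, and the toy data are assumptions made for illustration.

```python
import random


def fit_stump(X, y, features):
    """Return the (feature, threshold, flip) split with the fewest training errors.

    The stump predicts 'case' when genotype > threshold, or the reverse
    when flip is True. Ties keep the first split examined.
    """
    best = None
    for f in features:
        for t in (0, 1):
            for flip in (False, True):
                errors = sum(
                    ((x[f] > t) != flip) != bool(label)
                    for x, label in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, t, flip)
    return best[1:]


def fit_forest(X, y, n_trees=200, seed=0):
    """Fit stumps on bootstrap samples, each seeing a random subset of SNPs."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    m = max(1, int(p ** 0.5))  # number of SNPs offered to each stump
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        feats = rng.sample(range(p), m)
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest


def predict_proba(forest, x):
    """Fraction of stumps that vote 'case' for genotype vector x."""
    votes = sum((x[f] > t) != flip for f, t, flip in forest)
    return votes / len(forest)


# Hypothetical toy genotypes: SNP 0 separates cases from controls,
# SNPs 1-3 are identical in both groups and therefore uninformative.
X = [
    [2, 0, 1, 2], [1, 1, 0, 0], [2, 2, 1, 1], [1, 0, 2, 0],  # cases
    [0, 0, 1, 2], [0, 1, 0, 0], [0, 2, 1, 1], [0, 0, 2, 0],  # controls
]
y = [1, 1, 1, 1, 0, 0, 0, 0]

forest = fit_forest(X, y)
p_high = predict_proba(forest, [2, 1, 1, 1])  # carries the risk genotype
p_low = predict_proba(forest, [0, 1, 1, 1])   # same background, no risk allele
print(p_high, p_low)
```

The two query samples differ only at SNP 0, so any difference between `p_high` and `p_low` comes from the stumps that split on that SNP. A full random forest grows deeper trees and, in some implementations, averages per-tree class probabilities rather than counting hard votes.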

Summary

Introduction

Genome-wide association studies (GWAS) have identified over 60 genetic loci that confer risk for SLE[2], but a large proportion of the genetic contribution to SLE susceptibility remains unknown. The genetic background of specific manifestations of SLE is less well known than that of SLE in general, although several single nucleotide polymorphisms (SNPs) have been associated with the subgroup of patients with lupus nephritis[3, 4]. The risk of 18 common diseases has been predicted using a logistic regression model evaluated on SNP allele frequencies and odds ratios[9], but SLE was not included in that study.
