Abstract

The objective of this study was to detect interactions between relevant single-nucleotide polymorphisms (SNPs) associated with rheumatoid arthritis (RA). Data from Problem 1 of the Genetic Analysis Workshop 16 were used. These data consisted of 868 cases and 1,194 controls genotyped with the 500 k Illumina chip. First, machine learning methods were applied for preselecting SNPs. One hundred SNPs outside the HLA region and 1,500 SNPs in the HLA region were preselected using information-gain theory. The software weka was used to reduce colinearity and redundancy in the HLA region, resulting in a subset of 6 SNPs out of 1,500. In a second step, a parametric approach to account for interactions between SNPs in the HLA region, as well as HLA-nonHLA interactions was conducted using a Bayesian threshold least absolute shrinkage and selection operator (LASSO) model incorporating 2,560 covariates. This approach detected some main and interaction effects for SNPs in genes that have previously been associated with RA (e.g., rs2395175, rs660895, rs10484560, and rs2476601). Further, some other SNPs detected in this study may be considered in candidate gene studies.

Highlights

  • Rheumatoid arthritis (RA) is a chronic disease with known autoimmune pathophysiology

  • One approach for efficiently handling high dimensional GWA data consists of two steps: 1) reducing dimensionality by filtering non-informative markers, and 2) applying a more sophisticated model to quantify the effect of the selected single-nucleotide polymorphisms (SNPs) and their interactions

  • Data Data from the North American Rheumatoid Arthritis Consortium (NARAC) provided by the Genetic Analysis Workshop 16 (GAW16) were used in the analyses

Read more

Summary

Introduction

Rheumatoid arthritis (RA) is a chronic disease with known autoimmune pathophysiology. Information gain and the wrapper procedure are examples of machine learning that have shown benefits over linear regression for Step 1 These methods are easy to implement and may deal with crude, noisy, and inconsistent information. The function that relates covariates to observations is unspecified, providing more flexibility in the model They may deal with non-additive effects, which are of interest in genetic epidemiology. Some drawbacks to these methods exist: information gain may not completely remove colinearity in the system and the wrapper procedure it is too computationally demanding to use on a large number of records and SNPs. a second step is necessary. We propose a novel method that is able to reduce colinearity and make a higher shrinkage to zero than other methods for less relevant SNP and SNP × SNP effects

Objectives
Methods
Results
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call