Abstract

Although a number of treatments are available for rheumatoid arthritis (RA), each of them shows a significant nonresponse rate in patients. Therefore, predicting a priori the likelihood of treatment response would be of great patient benefit. Here, we conducted a comparison of a variety of statistical methods for predicting three measures of treatment response, between baseline and 3 or 6 months, using genome‐wide SNP data from RA patients available from the MAximising Therapeutic Utility in Rheumatoid Arthritis (MATURA) consortium. Two different treatments and 11 different statistical methods were evaluated. We used 10‐fold cross validation to assess predictive performance, with nested 10‐fold cross validation used to tune the model hyperparameters when required. Overall, we found that SNPs added very little prediction information to that obtained using clinical characteristics only, such as baseline trait value. This observation can be explained by the lack of strong genetic effects and the relatively small sample sizes available; in analysis of simulated and real data, with larger effects and/or larger sample sizes, prediction performance was much improved. Overall, methods that were consistent with the genetic architecture of the trait were able to achieve better predictive ability than methods that were not. For treatment response in RA, methods that assumed a complex underlying genetic architecture achieved slightly better prediction performance than methods that assumed a simplified genetic architecture.

Highlights

  • Rheumatoid arthritis (RA) is an autoimmune disease that results in chronic joint inflammation (McInnes & Schett, 2007)

  • We compare the prediction ability, on a relatively small data set, of 11 methods capable of handling cases where the number of SNPs exceeds the number of individuals: lasso, ridge, elastic net, random forests (RF), support vector regression (SVR), sparse partial least squares (SPLS), genome‐wide complex trait analysis (GCTA‐GREML), a Bayesian sparse linear mixed model (BSLMM), a neural network (SkyNet), polygenic risk scores (PRSice), and linkage disequilibrium (LD)‐based polygenic risk scores (LDpred)

  • Although the biology of PBC does not relate to the biology of rheumatoid arthritis (RA), we considered this data set to provide an illustrative example of prediction in a real data set that lacks the handicaps of the MAximising Therapeutic Utility in Rheumatoid Arthritis (MATURA) data set


Summary

| INTRODUCTION

Rheumatoid arthritis (RA) is an autoimmune disease that results in chronic joint inflammation (McInnes & Schett, 2007). Individual components of the DAS28 such as CRP, ESR, and SJC28 have been found to be associated with imaging‐detected synovitis (Baker et al, 2014; Hensor et al, 2018), suggesting that these markers are the most relevant measures for treatment response. Following this recommendation, we considered change in CRP, SJC28, and ESR as three different measures of treatment response in the MATURA data sets (with ESR available for the MTX cohort only). Individual and SNP QC (as described above for the anti‐TNF data set) resulted in a data set with 657 patients and 6,291,430 SNPs. For analysing the change in CRP, the phenotype was defined as log(CRPfu + 1) − log(CRPbl + 1), and was adjusted for log(CRPbl + 1), the cohort effect and the first 10 PCs.
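The CRP phenotype definition above can be illustrated with a minimal sketch. It assumes that "adjusted for" means taking residuals from an ordinary least squares regression on the covariates, which is one common reading; the text here does not say how the adjustment was actually carried out. The function names, the simulated values, and the two-level cohort indicator are all hypothetical.

```python
import numpy as np

def crp_change_phenotype(crp_bl, crp_fu):
    """Return log(CRPfu + 1) - log(CRPbl + 1) for each patient."""
    crp_bl = np.asarray(crp_bl, dtype=float)
    crp_fu = np.asarray(crp_fu, dtype=float)
    return np.log(crp_fu + 1.0) - np.log(crp_bl + 1.0)

def adjust_for_covariates(y, covariates):
    """Residuals of y after OLS regression on an intercept plus covariates.

    Assumption: adjustment by residualisation; the source may instead
    have included the covariates directly in each prediction model.
    """
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y)), np.asarray(covariates, dtype=float)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

# Simulated stand-in data (not MATURA values).
rng = np.random.default_rng(0)
n = 200
crp_bl = rng.gamma(2.0, 10.0, size=n)   # baseline CRP
crp_fu = rng.gamma(2.0, 6.0, size=n)    # follow-up CRP
cohort = rng.integers(0, 2, size=n)     # hypothetical two-level cohort indicator
pcs = rng.normal(size=(n, 10))          # stand-in for the first 10 PCs

y = crp_change_phenotype(crp_bl, crp_fu)
covariates = np.column_stack([np.log(crp_bl + 1.0), cohort, pcs])
y_adj = adjust_for_covariates(y, covariates)
```

By construction the adjusted phenotype `y_adj` has mean zero and is uncorrelated with baseline log CRP, the cohort indicator, and the PCs, so any remaining signal a SNP-based predictor picks up is not attributable to those covariates.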

| METHODS
Method Lasso
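The evaluation scheme described in the abstract (10‐fold cross validation to assess prediction, with nested 10‐fold cross validation to tune hyperparameters) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' pipeline: ridge regression stands in for the lasso because it has a closed form, the genotypes and effect sizes are simulated, and the penalty grid and all function names are hypothetical.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Ridge coefficients on centred data; returns (beta, intercept)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    # Dual form: solves an n x n system, cheaper when SNPs outnumber individuals.
    alpha = np.linalg.solve(Xc @ Xc.T + lam * np.eye(len(y)), yc)
    beta = Xc.T @ alpha
    return beta, y_mean - x_mean @ beta

def kfold_indices(n, k, rng):
    """Shuffle 0..n-1 and split into k roughly equal folds."""
    return np.array_split(rng.permutation(n), k)

def cv_mse(X, y, lam, folds):
    """Mean squared prediction error of ridge(lam) over the given folds."""
    errs = []
    for i, test in enumerate(folds):
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        beta, b0 = ridge_fit(X[train], y[train], lam)
        errs.append(np.mean((y[test] - (X[test] @ beta + b0)) ** 2))
    return float(np.mean(errs))

def nested_cv_predictions(X, y, lambdas, k_outer=10, k_inner=10, seed=0):
    """Out-of-fold predictions; lambda is tuned inside each outer training set."""
    rng = np.random.default_rng(seed)
    preds = np.empty(len(y))
    outer = kfold_indices(len(y), k_outer, rng)
    for i, test in enumerate(outer):
        train = np.concatenate([f for j, f in enumerate(outer) if j != i])
        # Inner folds use training individuals only, so the held-out fold
        # never influences the choice of penalty.
        inner = kfold_indices(len(train), k_inner, rng)
        best = min(lambdas, key=lambda lam: cv_mse(X[train], y[train], lam, inner))
        beta, b0 = ridge_fit(X[train], y[train], best)
        preds[test] = X[test] @ beta + b0
    return preds

# Simulated genotypes (0/1/2) with more SNPs than individuals.
rng = np.random.default_rng(1)
n, p = 200, 300
X = rng.binomial(2, 0.5, size=(n, p)).astype(float)
y = X[:, :5] @ np.full(5, 2.0) + rng.normal(size=n)  # 5 causal SNPs (assumed)

preds = nested_cv_predictions(X, y, lambdas=[1.0, 10.0, 100.0, 1000.0])
r = float(np.corrcoef(preds, y)[0, 1])  # correlation of predicted vs observed
```

Keeping the tuning loop inside each outer training set is the point of the nesting: if the penalty were chosen using all individuals, the outer estimate of prediction performance would be optimistically biased.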
| RESULTS
| DISCUSSION