Combining Sparse Group Lasso and Linear Mixed Model Improves Power to Detect Genetic Variants Underlying Quantitative Traits.

Yingjie Guo, Alon Keinan, Xiaoyan Liu, Quan Zou, Chenxi Wu, Maozu Guo

doi:10.3389/fgene.2019.00271

Abstract

Genome-wide association studies (GWAS), based on testing one single nucleotide polymorphism (SNP) at a time, have revolutionized our understanding of the genetics of complex traits. In GWAS, there is a need to account for confounding effects, such as those due to population structure, and to consider groups of SNPs simultaneously because of the "polygenic" nature of complex quantitative traits. In this paper, we propose a new approach, SGL-LMM, that combines sparse group lasso (SGL) and a linear mixed model (LMM) for multivariate association analysis of quantitative traits. LMM, which has often been used in GWAS, controls for confounders, while SGL maintains sparsity of the underlying multivariate regression model. SGL-LMM first fixes the fixed effects at zero to learn the parameters of the random effects using LMM, and then estimates the fixed effects using SGL regularization. We present efficient algorithms for hyperparameter tuning and for feature selection using stability selection. While controlling for confounders and constraining for sparse solutions, SGL-LMM also provides a natural framework for incorporating prior biological information into the group structure underlying the model. Results based on both simulated and real data show that SGL-LMM outperforms previous approaches in terms of power to detect associations and accuracy of quantitative trait prediction.
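The two-step procedure described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a sample-by-sample covariance matrix `K` for the random effect, a variance ratio `delta = sigma_e^2 / sigma_g^2` chosen by a grid search on the null-model likelihood (fixed effects set to zero), and a rotation of the data by the eigenvectors of `K` so that a sparse regression can afterwards be run on decorrelated samples. The function names `fit_null_lmm` and `whiten`, the grid, and the simulated data are all hypothetical.

```python
import numpy as np


def fit_null_lmm(y, K, log_delta_grid=None):
    """Step 1 (assumed form): with beta = 0, pick delta = sigma_e^2 / sigma_g^2
    by maximising the Gaussian likelihood of y ~ N(0, sigma_g^2 (K + delta I))."""
    if log_delta_grid is None:
        log_delta_grid = np.linspace(-5.0, 5.0, 101)
    m = y.shape[0]
    S, U = np.linalg.eigh(K)           # K = U diag(S) U^T
    S = np.clip(S, 0.0, None)          # guard against tiny negative eigenvalues
    Uy = U.T @ y
    best_nll, best_delta, best_sigma_g2 = np.inf, None, None
    for log_delta in log_delta_grid:
        delta = np.exp(log_delta)
        d = S + delta
        sigma_g2 = np.mean(Uy ** 2 / d)  # closed-form ML estimate for this delta
        nll = 0.5 * (m * np.log(2 * np.pi * sigma_g2) + np.sum(np.log(d)) + m)
        if nll < best_nll:
            best_nll, best_delta, best_sigma_g2 = nll, delta, sigma_g2
    return best_delta, best_sigma_g2, S, U


def whiten(X, y, S, U, delta):
    """Rotate and rescale so the transformed samples have identity covariance
    (up to the factor sigma_g^2)."""
    w = 1.0 / np.sqrt(S + delta)
    return (U.T @ X) * w[:, None], (U.T @ y) * w


# Hypothetical simulated data: K is built from random "genotypes".
rng = np.random.default_rng(0)
m, q = 40, 100
G = rng.standard_normal((m, q))
K = G @ G.T / q
X = G[:, :10]
y = rng.multivariate_normal(np.zeros(m), 1.0 * K + 0.5 * np.eye(m))

delta, sigma_g2, S, U = fit_null_lmm(y, K)
X_w, y_w = whiten(X, y, S, U, delta)
```

In this sketch, the whitened pair `(X_w, y_w)` would then be passed to an SGL solver to estimate the fixed effects (step 2).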

Highlights

Quantitative traits are important in medicine, agriculture, and evolution but, until recently, few polymorphisms have been shown to be related to these traits.
Experiments on semi-empirical data showed that combining sparse group lasso with a linear mixed model yielded better power to identify marker associations across a wide range of settings, and application to real datasets verified that SGL-LMM generated a sparse solution with accurate prediction of phenotypes and interpretable detection of marker associations.
We modeled the phenotype as a sum of three terms: a fixed effect determined by the associated single nucleotide polymorphisms (SNPs), a random confounding effect due to population structure, and i.i.d. noise, as follows: $y = X\beta + y_{\mathrm{pop}} + \varphi$, where $y$ is an $m \times 1$ vector of observed phenotypes for $m$ samples, $X$ is an $m \times q$ matrix that consists of SNPs and other variables of the $m$ samples, and $y_{\mathrm{pop}}$ is an $m \times 1$ random vector with distribution $\mathcal{N}(0, \sigma_g^2 K)$, where $K$
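The marginal distribution implied by this three-term model can be written out in one step. The derivation below is a sketch that assumes $K$ is a fixed $m \times m$ symmetric positive semi-definite matrix, that the noise has variance $\sigma_e^2$, and that the random effect and the noise are independent; the symbols $\sigma_e^2$ and $\delta$ are introduced here for illustration and are not specified above.

```latex
\begin{align}
y &= X\beta + y_{\mathrm{pop}} + \varphi,
  \qquad y_{\mathrm{pop}} \sim \mathcal{N}(0, \sigma_g^2 K),
  \qquad \varphi \sim \mathcal{N}(0, \sigma_e^2 I_m) \\
\Rightarrow\quad y &\sim \mathcal{N}\!\left(X\beta,\; \sigma_g^2 K + \sigma_e^2 I_m\right)
  = \mathcal{N}\!\left(X\beta,\; \sigma_g^2 \,(K + \delta I_m)\right),
  \qquad \delta = \frac{\sigma_e^2}{\sigma_g^2}.
\end{align}
```

Under these assumptions, the confounding term only changes the covariance of $y$, so once $\delta$ is known the fixed effects $\beta$ can be estimated by a regression on suitably decorrelated data.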

Summary

INTRODUCTION

Quantitative traits are important in medicine, agriculture, and evolution but, until recently, few polymorphisms have been shown to be related to these traits. The SGL has an L2 penalty that promotes the selection of only a subset of the groups and an L1 penalty that promotes the selection of only a subset of the predictors within a group. Another important factor in genetic association studies is confounding, that is, indirect associations between markers and traits due to factors such as population structure, family structure, and cryptic relatedness. Experiments on semi-empirical data showed that combining sparse group lasso with a linear mixed model yielded better power to identify marker associations across a wide range of settings, and application to real datasets verified that SGL-LMM generated a sparse solution with accurate prediction of phenotypes and interpretable detection of marker associations.
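The interplay of the two SGL penalties can be illustrated with a short sketch. It assumes the commonly used parameterisation lam * (alpha * ||beta||_1 + (1 - alpha) * sum_g sqrt(p_g) * ||beta_g||_2), where p_g is the size of group g, together with a squared-error loss minimised by plain proximal gradient descent. The function names, the mixing parameter `alpha`, and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np


def sgl_penalty(beta, groups, lam, alpha):
    """Sparse group lasso penalty (assumed parameterisation, see lead-in)."""
    l1 = np.sum(np.abs(beta))
    l2 = sum(np.sqrt(np.sum(groups == g)) * np.linalg.norm(beta[groups == g])
             for g in np.unique(groups))
    return lam * (alpha * l1 + (1 - alpha) * l2)


def sgl_prox(beta, groups, lam, alpha, step=1.0):
    """Proximal operator of step * sgl_penalty: element-wise soft-thresholding
    (L1 part) followed by group-wise shrinkage (L2 part)."""
    z = np.sign(beta) * np.maximum(np.abs(beta) - step * lam * alpha, 0.0)
    out = np.zeros_like(z)
    for g in np.unique(groups):
        idx = groups == g
        norm = np.linalg.norm(z[idx])
        thresh = step * lam * (1 - alpha) * np.sqrt(idx.sum())
        if norm > thresh:              # otherwise the whole group is set to zero
            out[idx] = (1 - thresh / norm) * z[idx]
    return out


def fit_sgl(X, y, groups, lam, alpha, n_iter=500):
    """Minimise (1 / 2m) * ||y - X beta||^2 + sgl_penalty by proximal gradient."""
    m, q = X.shape
    step = m / np.linalg.norm(X, 2) ** 2   # 1 / Lipschitz constant of the gradient
    beta = np.zeros(q)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / m
        beta = sgl_prox(beta - step * grad, groups, lam, alpha, step)
    return beta


# Toy example: two groups of two predictors each.
groups = np.array([0, 0, 1, 1])
shrunk = sgl_prox(np.array([3.0, -0.5, 0.2, 0.1]), groups, lam=1.0, alpha=0.5)

rng = np.random.default_rng(1)
X_demo = rng.standard_normal((50, 4))
y_demo = 2.0 * X_demo[:, 0] + 0.1 * rng.standard_normal(50)
beta_hat = fit_sgl(X_demo, y_demo, groups, lam=0.1, alpha=0.5)
```

In the toy call to `sgl_prox`, the weak second group is removed entirely by the group-wise step, while within the first group the small coefficient is zeroed by the element-wise step and only the large one survives, which is the two-level sparsity described above.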

Method

Sparse Group Lasso

Phenotype Prediction

Model Selection

Application With Arabidopsis thaliana Data

Existing Methods

Performance Measurements

Alternative Methods

Application With Arabidopsis thaliana

DISCUSSION


Journal: Frontiers in Genetics
Publication Date: Apr 10, 2019
Citations: 5
License type: CC BY 4.0


