Abstract
Background
Quantitative trait locus (QTL) mapping in genetic data often involves the analysis of correlated observations, which need to be accounted for to avoid false association signals. This is commonly done by modeling such correlations as random effects in linear mixed models (LMMs). The R package lme4 is a well-established tool that implements major LMM features using sparse matrix methods; however, it is not fully adapted for QTL mapping in association and linkage studies. In particular, two LMM features are lacking in the base version of lme4: the definition of random effects by custom covariance matrices, and parameter constraints, which are essential in advanced QTL models. Apart from applications in linkage studies of related individuals, such functionalities are of high interest for association studies in situations where multiple covariance matrices need to be modeled, a scenario not covered by many genome-wide association study (GWAS) software packages.

Results
To address these limitations, we developed a new R package, lme4qtl, as an extension of lme4. First, lme4qtl contributes new models for genetic studies within a single tool integrated with lme4 and its companion packages. Second, lme4qtl offers a flexible framework for scenarios with multiple levels of relatedness and becomes efficient when covariance matrices are sparse. We showed the value of our package using real family-based data from the Genetic Analysis of Idiopathic Thrombophilia 2 (GAIT2) project.

Conclusions
Our software lme4qtl enables QTL mapping models with a versatile structure of random effects and efficient computation for sparse covariances. lme4qtl is available at https://github.com/variani/lme4qtl.
Highlights
Quantitative trait locus (QTL) mapping in genetic data often involves analysis of correlated observations, which need to be accounted for to avoid false association signals
The standard statistical approach used in quantitative trait locus (QTL) mapping is the linear mixed model (LMM), which can effectively assess and estimate the contribution of an individual genetic locus in the presence of correlated observations [1,2,3,4]
We have developed a new lme4qtl R package that unlocks the well-established lme4 framework for QTL mapping analysis
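As a reading aid, the kind of LMM referred to in these highlights can be written in a generic textbook form. The notation below is an assumption introduced for illustration and is not taken from the paper: y is the trait vector for n individuals, X is the fixed-effect design matrix (covariates and, in an association test, the genotype at the tested locus), and K is a custom relationship (covariance) matrix between individuals.

```latex
\begin{aligned}
y &= X\beta + g + e, \\
g &\sim \mathcal{N}\!\left(0,\ \sigma_g^2 K\right), \qquad
e \sim \mathcal{N}\!\left(0,\ \sigma_e^2 I_n\right), \\
\operatorname{Var}(y) &= \sigma_g^2 K + \sigma_e^2 I_n .
\end{aligned}
```

In this sketch, the correlation between observations enters only through K, which is why supplying a custom covariance matrix for a random effect matters for QTL mapping. Further random effects would each add a term of the same form to Var(y).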
Summary
We considered three models for the analysis of activated partial thromboplastin time (APTT) in the GAIT2 data: polygenic, SNP-based association, and gene-environment interaction. The GWAS computation time of the association analysis with two random effects in lme4qtl was 7.6 h. We implemented different restrictions on model parameters in lme4qtl by means of a special syntax for the vcControl parameter, as described in Additional file 1: Supplementary Note 2. Numerical results of the likelihood ratio tests in Additional file 1: Supplementary Note 3 showed that the evidence for gene-environment interaction is weak. To assess how sparsity affects computation time, we used the polygenic model m1 as an initial model (the random effect (1|hhid) was omitted), in which the genetic relationship matrix mat has a high proportion of zero values (sparsity) equal to 0.98. The time required to fit the polygenic model increased substantially as the matrix became denser: it was an order of magnitude greater when the sparsity decreased from the GAIT2 level of 0.98 to 0.60 (Additional file 1: Figure S4).
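The sparsity figure quoted above is the proportion of zero entries in the relationship matrix. The following minimal Python sketch illustrates that definition on a toy matrix. It is not the GAIT2 data and does not use lme4qtl: the helper names (`family_relationship_matrix`, `sparsity`) and the pedigree layout (unrelated nuclear families of two parents and two children, with relationship coefficients 1 on the diagonal, 0.5 for parent-child and full-sibling pairs, and 0 otherwise) are assumptions made for illustration.

```python
def family_relationship_matrix(n_families, n_children=2):
    """Toy block-diagonal relationship matrix for unrelated nuclear families.

    Hypothetical layout: each family has 2 parents followed by n_children
    children. Individuals in different families are unrelated (entry 0).
    """
    size = 2 + n_children
    n = n_families * size
    mat = [[0.0] * n for _ in range(n)]
    for f in range(n_families):
        start = f * size
        for i in range(size):
            for j in range(size):
                if i == j:
                    value = 1.0   # individual with itself
                elif i < 2 and j < 2:
                    value = 0.0   # the two parents are assumed unrelated
                else:
                    value = 0.5   # parent-child or full-sibling pair
                mat[start + i][start + j] = value
    return mat


def sparsity(mat):
    """Proportion of entries in the matrix that are exactly zero."""
    total = sum(len(row) for row in mat)
    zeros = sum(1 for row in mat for v in row if v == 0.0)
    return zeros / total


# 50 families of 4 individuals give a 200 x 200 matrix.
mat = family_relationship_matrix(n_families=50)
print(f"sparsity = {sparsity(mat):.4f}")  # prints "sparsity = 0.9825"
```

In this toy setting, sparsity rises as the number of unrelated families grows, because only within-family blocks contain non-zero entries. That block structure is the situation in which sparse matrix methods are expected to pay off.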