Abstract
Since their introduction in the 1950s, variance component mixed models have been widely used in many application fields. In this context, ReML estimation is by far the most popular procedure for inferring the variance components of the model. Although many implementations of the ReML procedure are readily available, there is still a need for computational improvements because of the ever-increasing size of the datasets to be handled and the complexity of the models to be fitted. In this paper, we present a Min-Max (MM) algorithm for ReML inference and combine it with several speed-up procedures. The ReML MM algorithm is compared to five state-of-the-art, publicly available algorithms used in statistical genetics. The computational performance of the different algorithms is evaluated on several datasets representing different plant breeding experimental designs. The MM algorithm ranks among the top two methods in almost all settings and is more versatile than many of its competitors. It is a promising alternative to the classical AI-ReML algorithm in the context of variance component mixed models and is available in the MM4LMM R package.
Highlights
Since their formal introduction in the early 1950s [1, 2], mixed models have become an indispensable tool of modern statistics
The Restricted Maximum Likelihood (ReML) MM algorithm we present is compared to five state-of-the-art, publicly available algorithms used in statistical genetics
We provide a thorough benchmark comparison to illustrate that the proposed MM algorithm can compete with several state-of-the-art algorithms that are widely used in quantitative genetics to perform ReML estimation, including BOLT-LMM [17], FaST-LMM [7], gaston [9], GEMMA [18] and GridLMM [19]
Summary
Since their formal introduction in the early 1950s [1, 2], mixed models have become an indispensable tool of modern statistics. With the development of high-throughput technologies, special care has been devoted to developing efficient algorithmic procedures for the inference of mixed models [6,7,8,9]. This is illustrated by the availability of many software tools that either perform inference in a mixed model including many (fixed and random) effects on large datasets, or efficiently fit hundreds of thousands of mixed models with a limited number of variance components. Many of these tools were developed in the context of animal or human quantitative genetics, or of single-environment plant studies. There, a popular strategy for GWAS analysis is the one presented in [10], which requires fitting a simple 2-component mixed model per marker. This is a favorable case for which efficient algorithms exist that allow the analysis of very large panels of genotyped individuals [7, 8].