Abstract
Background
The distribution of residual effects in linear mixed models in animal breeding applications is typically assumed normal, which makes inferences vulnerable to outlier observations. One option to mute the impact of outliers is to fit models with residuals having a heavy-tailed distribution. Here, a Student's-t model was considered for the distribution of the residuals, with the degrees of freedom treated as unknown. Bayesian inference with Markov chain Monte Carlo methods was used to investigate a bivariate Student's-t (BSt) model in a simulation study, and an analysis of field data for gestation length and birth weight made it possible to study the practical implications of fitting heavy-tailed distributions for residuals in linear mixed models.
Methods
In the simulation study, bivariate residuals were generated using a Student's-t distribution with 4 or 12 degrees of freedom, or a normal distribution. Sire models with bivariate Student's-t or normal residuals were fitted to each simulated dataset using a hierarchical Bayesian approach. For the field data, consisting of gestation length and birth weight records on 7,883 Italian Piemontese cattle, a sire-maternal grandsire model including fixed effects of sex-age of dam and uncorrelated random herd-year-season effects was fitted using a hierarchical Bayesian approach. Residuals were defined to follow bivariate normal or Student's-t distributions with unknown degrees of freedom.
Results
Posterior mean estimates of degrees of freedom parameters seemed to be accurate and unbiased in the simulation study. Estimates of sire and herd variances were similar, if not identical, across fitted models. In the field data, there was strong support based on predictive log-likelihood values for the Student's-t error model. Most of the posterior density for degrees of freedom was below 4. Posterior means of direct and maternal heritabilities for birth weight were smaller in the Student's-t model than those in the normal model. Re-rankings of sires were observed between heavy-tailed and normal models.
Conclusions
Reliable estimates of degrees of freedom were obtained in all simulated heavy-tailed and normal datasets. The predictive log-likelihood was able to distinguish the correct model among the models fitted to heavy-tailed datasets. There was no disadvantage of fitting a heavy-tailed model when the true model was normal. Predictive log-likelihood values indicated that heavy-tailed models with low degrees of freedom values fitted gestation length and birth weight data better than a model with normally distributed residuals. Heavy-tailed and normal models resulted in different estimates of direct and maternal heritabilities, and different sire rankings. Heavy-tailed models may be more appropriate for reliable estimation of genetic parameters from field data.
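The residual-generation step of the simulation study can be illustrated with a minimal sketch. It assumes the standard representation of a multivariate Student's-t variate as a scale mixture of normals; the abstract does not state how the authors drew their samples. The function name `simulate_bivariate_residuals`, the seed, and the scale matrix values are hypothetical and are not taken from the paper.

```python
import numpy as np


def simulate_bivariate_residuals(n, scale, df=None, seed=0):
    """Draw n bivariate residuals (hypothetical helper, not the authors' code).

    df=None gives bivariate normal residuals; otherwise bivariate
    Student's-t residuals with `df` degrees of freedom and scale matrix `scale`.
    """
    rng = np.random.default_rng(seed)
    scale = np.asarray(scale, dtype=float)
    # Normal draws with covariance equal to the scale matrix.
    z = rng.multivariate_normal(np.zeros(2), scale, size=n)
    if df is None:
        return z
    # Scale mixture of normals: one weight w_i = chi2(df) / df per record,
    # shared by both traits, so e_i = z_i / sqrt(w_i) is bivariate Student's-t.
    # For df > 2 the covariance of e_i is scale * df / (df - 2).
    w = rng.chisquare(df, size=n) / df
    return z / np.sqrt(w)[:, None]


# Illustrative scale matrix (assumed values, not from the paper).
scale = [[1.0, 0.3], [0.3, 2.0]]
e_t4 = simulate_bivariate_residuals(5000, scale, df=4)    # heavy tails
e_t12 = simulate_bivariate_residuals(5000, scale, df=12)  # milder tails
e_bn = simulate_bivariate_residuals(5000, scale)          # normal residuals
```

Because each record's weight is shared by both traits, an outlying record tends to be outlying for both, which is what distinguishes a bivariate Student's-t from two independent univariate ones.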
Highlights
The distribution of residual effects in linear mixed models in animal breeding applications is typically assumed normal, which makes inferences vulnerable to outlier observations
Simulation study: the predictive log-likelihood values in Table 2 were computed for bivariate Student’s-t (BSt) and bivariate normal (BN) models fitted to the simulated heavy-tailed and normal datasets
There was no difference in predictive log-likelihoods (PLL) between BSt and BN-∞ models and there were no disadvantages of fitting a BSt model when the true model was normal
Summary
The distribution of residual effects in linear mixed models in animal breeding applications is typically assumed normal, which makes inferences vulnerable to outlier observations. Von Rohr and Hoeschele [5] have demonstrated the application of a Student’s-t sampling model under four different error distributions in statistical mapping of quantitative trait loci (QTL). They have determined that additive and dominance QTL and residual variance estimates are much closer to the simulated true values when the data itself is heavy-tailed and the analysis is performed with the skewed Student’s-t model rather than with a normal model. The objectives of this research were to 1) examine by simulation if Bayesian inference under a bivariate Student’s-t distribution of residuals can accommodate models with either light-tailed or heavy-tailed residuals, and 2) investigate the practical implications of fitting a Student’s-t distribution with unknown degrees of freedom for the residuals in bivariate field data. In both cases, results were compared to those from the conventional approach of assuming bivariate normal (BN) residuals.