Abstract

Background: Heritability of a phenotypic or molecular trait measures the proportion of variance that is attributable to genotypic variance. It is an important concept in breeding and genetics. Few methods are available for calculating heritability for traits derived from high-throughput sequencing.

Results: We propose several statistical models and methods, based on linear and generalized linear mixed effects models, to compute and test a heritability measure for such data. We also provide methodology for hypothesis testing and interval estimation. Our analyses show that the negative binomial mixed model (NB-fit), the compound Poisson mixed model (CP-fit), and the variance stabilizing transformed linear mixed model (VST) outperform the voom-transformed linear mixed model (voom). NB-fit and VST appear to be more robust than CP-fit for estimating and testing the heritability scores, while NB-fit is the most computationally expensive. CP-fit performed best in terms of the coverage of the confidence intervals. In addition, we applied the methods to both microRNA (miRNA) and messenger RNA (mRNA) sequencing datasets from a recombinant inbred mouse panel. We show that miRNA and mRNA expression can be highly heritable molecular traits in mouse, and that some of the most heritable features coincide with expression quantitative trait loci.

Conclusions: The models and methods we investigated in this manuscript are applicable and extendable to sequencing experiments where some biological replicates are available and the environmental variation is properly controlled. To our knowledge, the CP-fit approach for assessing heritability was implemented here for the first time. All the methods presented, as well as the generation of simulated sequencing data under either negative binomial or compound Poisson mixed models, are provided in the R package HeritSeq.

Highlights

  • Heritability of a phenotypic or molecular trait measures the proportion of variance that is attributable to genotypic variance

  • We propose two generalized linear mixed models (GLMMs) that can use the data directly, without a transformation: (i) the compound Poisson mixed model (CPMM), which is based on the compound Poisson distribution, a special case of the Tweedie family, and can model data using a continuous distribution with a point mass at zero; (ii) the negative binomial mixed model (NBMM), which is the most popular choice for modeling high-throughput sequencing (HTS) data due to its simplicity and ability to accommodate overdispersion

  • Our work reports the use of the variance partition coefficient to extend the definition of heritability for generalized linear mixed models in the context of sequencing data
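
To make the variance partition coefficient (VPC) idea concrete, here is a minimal sketch for a negative binomial random-intercept model. It rests on assumptions that are not stated in the text above: a log link, a normally distributed strain effect b ~ N(0, sigma2), and the parametrization Var(Y | b) = mu + phi * mu^2 with mu = exp(beta0 + b). The function name `nb_vpc` and its arguments are hypothetical and are not part of the HeritSeq API. Under these assumptions the law of total variance gives VPC = Var(E[Y | b]) / (Var(E[Y | b]) + E[Var(Y | b)]), which simplifies to the closed form used in the code.

```python
import math


def nb_vpc(beta0: float, sigma2: float, phi: float) -> float:
    """Variance partition coefficient under an ASSUMED model:

        b ~ N(0, sigma2),  mu = exp(beta0 + b),
        Y | b ~ NB with mean mu and variance mu + phi * mu**2.

    Var(E[Y|b]) = exp(2*beta0 + sigma2) * (exp(sigma2) - 1)
    E[Var(Y|b)] = exp(beta0 + sigma2/2) + phi * exp(2*beta0 + 2*sigma2)
    Dividing both terms by exp(2*beta0 + sigma2) gives the expression below.
    """
    if sigma2 < 0 or phi < 0:
        raise ValueError("sigma2 and phi must be non-negative")
    # Between-strain part (scaled): exp(sigma2) - 1
    between = math.expm1(sigma2)
    # Within-strain part (scaled): overdispersion term plus Poisson-like term
    within = phi * math.exp(sigma2) + math.exp(-beta0 - sigma2 / 2.0)
    return between / (between + within)


if __name__ == "__main__":
    # Illustrative parameter values only; they are not taken from the paper.
    print(nb_vpc(beta0=3.0, sigma2=0.5, phi=0.1))
```

With no strain-level variance (sigma2 = 0) the sketch returns 0, and for fixed sigma2 and phi the value rises with beta0, so under this model more highly expressed features have a larger share of variance attributed to strain.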


Summary

Introduction

Heritability of a phenotypic or molecular trait measures the proportion of variance that is attributable to genotypic variance. It is an important concept in breeding and genetics. High-throughput sequencing (HTS) technology has several advantages over microarrays and is favored by most researchers [12], but it produces data that are highly non-Gaussian, so its rise demands the development of new methods. We propose several statistical models and methods, based on linear and generalized linear mixed effects models, to estimate heritability for high-throughput sequencing data.
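
As a baseline illustration of the definition above, the following sketch estimates heritability for a single Gaussian trait measured on replicated strains, using the classical one-way ANOVA (method-of-moments) variance components. This is a swapped-in textbook estimator for a balanced design, not the mixed-model fitting procedure used in the paper, and the function names `anova_heritability` and `simulate_strains` are hypothetical.

```python
import random
from statistics import fmean


def anova_heritability(groups):
    """Estimate h2 = var_g / (var_g + var_e) from a balanced design.

    groups: list of equal-length lists, one list of replicate values per strain.
    Uses one-way ANOVA mean squares; a negative var_g estimate is set to 0.
    """
    k, n = len(groups), len(groups[0])
    if k < 2 or n < 2 or any(len(g) != n for g in groups):
        raise ValueError("need >= 2 strains with the same number (>= 2) of replicates")
    means = [fmean(g) for g in groups]
    grand = fmean(means)
    # Mean square between strains and mean square within strains
    msb = n * sum((m - grand) ** 2 for m in means) / (k - 1)
    msw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g) / (k * (n - 1))
    var_g = max((msb - msw) / n, 0.0)  # E[MSB] = var_e + n * var_g
    total = var_g + msw
    return var_g / total if total > 0 else 0.0


def simulate_strains(k, n, var_g, var_e, seed=0):
    """Simulate k strains with n replicates: y = strain effect + residual noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(k):
        effect = rng.gauss(0.0, var_g ** 0.5)
        data.append([effect + rng.gauss(0.0, var_e ** 0.5) for _ in range(n)])
    return data


if __name__ == "__main__":
    # Equal genetic and residual variance, so the true heritability is 0.5.
    sim = simulate_strains(k=200, n=5, var_g=1.0, var_e=1.0, seed=1)
    print(anova_heritability(sim))
```

The estimator relies on approximately Gaussian residuals, which raw sequencing counts generally violate. That is the motivation for either transforming the counts before fitting a linear mixed model or modeling them directly with a generalized linear mixed model.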

