Abstract

The problem of inferring the relatedness distribution between two individuals from biallelic marker data is considered. This problem can be cast as an estimation task in a mixture model: at each marker the latent variable is the relatedness state, and the observed variable is the genotype of the two individuals. In this model, only the prior proportions are unknown, and can be obtained via ML estimation using the EM algorithm. When the markers are biallelic and the data unphased, the identifiability of the model is known not to be guaranteed. In this article, model identifiability is investigated in the case of phased data generated from a crossing design, a classical situation in plant genetics. It is shown that identifiability can be guaranteed under some conditions on the crossing design. The adapted ML estimator is implemented in an R package called Relatedness. The performance of the ML estimator is evaluated and compared to that of the benchmark moment estimator, both on simulated and real data. Compared to its competitor, the ML estimator is shown to be more robust and to provide more realistic estimates.
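
As an illustration of the estimation step described above, the following is a minimal Python sketch of EM for a mixture in which the component distributions are known and only the mixing proportions are estimated. It is not the code of the Relatedness R package. The function name `em_mixture_proportions`, the matrix `lik` (the probability of the observed genotype pair at each marker given each relatedness state) and the toy values are assumptions made for the example; in practice the likelihood matrix would be derived from the allele frequencies and the crossing design.

```python
import numpy as np


def em_mixture_proportions(lik, n_iter=500, tol=1e-10):
    """Estimate mixing proportions by EM when component likelihoods are known.

    lik[m, k] is assumed to hold P(observed genotypes at marker m | state k).
    Every marker must have positive likelihood under at least one state.
    """
    lik = np.asarray(lik, dtype=float)
    n_markers, n_states = lik.shape
    # Start from uniform proportions over the relatedness states.
    pi = np.full(n_states, 1.0 / n_states)
    for _ in range(n_iter):
        # E-step: posterior probability of each state at each marker.
        weighted = lik * pi
        post = weighted / weighted.sum(axis=1, keepdims=True)
        # M-step: new proportions are the average posterior over markers.
        new_pi = post.mean(axis=0)
        converged = np.max(np.abs(new_pi - pi)) < tol
        pi = new_pi
        if converged:
            break
    return pi


# Hypothetical toy input: 4 markers, 2 states, each marker compatible
# with exactly one state, so the estimate is the share of markers per state.
toy_lik = np.array([[1.0, 0.0],
                    [1.0, 0.0],
                    [1.0, 0.0],
                    [0.0, 1.0]])
toy_pi = em_mixture_proportions(toy_lik)
print(toy_pi)
```

When two states produce identical columns in `lik`, the likelihood cannot tell them apart and the sketch returns just one of many equally good solutions. This is the identifiability issue the article examines.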
