Abstract

Knowledge of how individuals are related is important in many areas of research, and numerous methods for inferring pairwise relatedness from genetic data have been developed. However, the majority of these methods were not developed for situations where data are limited. Specifically, most methods rely on the availability of population allele frequencies, the relative genomic position of variants and accurate genotype data. But in studies of non‐model organisms or ancient samples, such data are not always available. Motivated by this, we present a new method for pairwise relatedness inference, which requires neither allele frequency information nor information on genomic position. Furthermore, it can be applied not only to accurate genotype data but also to low‐depth sequencing data from which genotypes cannot be accurately called. We evaluate it using data from a range of human populations and show that it can be used to infer close familial relationships with accuracy similar to that of a widely used method that relies on population allele frequencies. Additionally, we show that our method is robust to SNP ascertainment and applicable to low‐depth sequencing data generated using different strategies, including resequencing and RADseq, which is important for application to a diverse range of populations and species.

Highlights

  • The ability to infer the familial relationship between a pair of individuals from genetic data plays a key role in several research fields

  • We showed that the method provides useful results when applied to ~4× sequencing data as well as to RADseq‐like (restriction site‐associated DNA sequencing) subsets of such data

  • The only differences are that the numerator and denominator are flipped and that E, the proportion of sites where both individuals are heterozygous, is included in the denominator in the statistic defined by Lee but absent in Ratio 0 (R0)


Summary

| INTRODUCTION

The ability to infer the familial relationship between a pair of individuals from genetic data plays a key role in several research fields. For both [R1, R0] and [R1, KING‐robust kinship], the joint expectation ranges of the four close relationship categories full‐siblings (FS), half‐siblings/avuncular/grandparent–grandchild (HS), first cousins (C1) and unrelated (UR) do not overlap, and the range of expected values for parent–offspring (PO) overlaps with that of FS only at a single point (Figure 2; for derivations see supplementary text). This is true regardless of the underlying allele frequency spectrum and holds for any pair of non‐inbred individuals from the same homogeneous population, making [R1, R0] and [R1, KING‐robust kinship] potentially useful for distinguishing between these relationships. We expect the three statistics to be robust to SNP ascertainment because they are ratios computed from sites that are variable within the two samples and should be unaffected by the number of non‐variable sites, and because the (unknown) underlying frequency spectrum should only have a limited effect on these ratios.
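To make the three statistics concrete, the following is a minimal sketch of how they might be computed from called genotypes for one pair of individuals. The formulas are not given in this summary; the sketch assumes the definitions commonly stated for these statistics, with the nine joint genotype counts labelled A to I (rows: genotype 0, 1, 2 of individual 1; columns: genotype of individual 2), so that C + G counts sites with opposite homozygotes and E counts sites where both individuals are heterozygous. The function names `genotype_table` and `relatedness_stats` and the toy counts are hypothetical and chosen only for illustration.

```python
import math


def genotype_table(g1, g2):
    """Tabulate joint genotype counts for two individuals.

    g1 and g2 are equal-length sequences of genotypes coded 0, 1 or 2
    (copies of one arbitrarily chosen allele at each site).
    table[i][j] is the number of sites where individual 1 has genotype i
    and individual 2 has genotype j.
    """
    table = [[0] * 3 for _ in range(3)]
    for a, b in zip(g1, g2):
        table[a][b] += 1
    return table


def _ratio(num, den):
    # Return NaN instead of raising when a denominator is zero.
    return num / den if den else math.nan


def relatedness_stats(table):
    """Return (R0, R1, KING-robust kinship) under the ASSUMED definitions:

    R0   = (C + G) / E
    R1   = E / (B + D + F + H + C + G)
    KING = (E - 2 (C + G)) / (B + D + F + H + 2 E)

    Sites where both individuals carry the same homozygous genotype
    (cells A and I) do not enter any of the three ratios.
    """
    (_a, b, c), (d, e, f), (g, h, _i) = table
    opposite_hom = c + g          # sites with no shared allele (IBS0)
    one_het = b + d + f + h       # sites where exactly one individual is heterozygous
    r0 = _ratio(opposite_hom, e)
    r1 = _ratio(e, one_het + opposite_hom)
    king = _ratio(e - 2 * opposite_hom, one_het + 2 * e)
    return r0, r1, king


# Toy counts, invented for illustration: no opposite-homozygote sites.
toy = [[500, 60, 0],
       [60, 120, 60],
       [0, 60, 500]]
r0, r1, king = relatedness_stats(toy)
print(r0, r1, king)
```

With these toy counts R0 is 0 because there are no sites at which the two individuals are opposite homozygotes, which is what one would expect for a parent–offspring pair in the absence of genotyping errors. The counts of sites in cells A and I can be changed freely without altering any of the three values, which illustrates why the number of non‐variable sites should not matter. For low‐depth sequencing data, where genotypes cannot be called reliably, the table would have to be estimated rather than counted; that step is not shown here.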

| METHODS AND MATERIALS
| DISCUSSION
| Limitations and applications