Abstract

Estimation of genetically related individuals is playing an increasingly important role in the ancient DNA field. In recent years, the numbers of sequenced individuals from single sites have been increasing, reflecting a growing interest in understanding the familial and social organisation of ancient populations. Although a few methods have been developed specifically for ancient DNA, namely to tackle issues such as low-coverage homozygous data, they require a minimum average genomic coverage of 0.1–1× per analysed pair of individuals. Here we present an updated version of a method that enables estimates of 1st- and 2nd-degree relatedness with as little as 0.026× average coverage, or around 18,000 SNPs from 1.3 million aligned reads per sample with an average length of 62 bp, roughly four times less data than 0.1× coverage at similar read lengths. By using simulated data to estimate false positive error rates, we further show that a threshold even as low as 0.012×, or around 4000 SNPs from 600,000 reads, will always show 1st-degree relationships as related. Lastly, by applying this method to published data, we are able to identify previously undocumented relationships using individuals that had been excluded from prior kinship analysis due to their very low coverage. This methodological improvement has the potential to enable relatedness estimation on ancient whole genome shotgun data during routine low-coverage screening, and therefore improve project management when decisions need to be made on which individuals are to be further sequenced.
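The coverage figures quoted above follow from read count and read length. The short sketch below reproduces that arithmetic; the genome length of about 3.1 Gb is an assumption made here for illustration and is not stated in the abstract, and the function name is hypothetical.

```python
# Assumption (not from the abstract): haploid human genome length of ~3.1 Gb.
GENOME_LENGTH_BP = 3.1e9


def average_coverage(n_reads, mean_read_length_bp, genome_length_bp=GENOME_LENGTH_BP):
    """Average genomic coverage = total aligned bases / genome length."""
    return n_reads * mean_read_length_bp / genome_length_bp


# Read counts and mean read length as quoted in the abstract.
recommended = average_coverage(1_300_000, 62)  # threshold for 1st- and 2nd-degree estimates
lower_bound = average_coverage(600_000, 62)    # threshold at which 1st-degree pairs still show as related

# How much less data the recommended threshold needs compared with 0.1x coverage.
reduction_vs_0_1x = 0.1 / recommended

print(f"recommended threshold: {recommended:.3f}x")
print(f"lower threshold:       {lower_bound:.3f}x")
print(f"reduction vs 0.1x:     {reduction_vs_0_1x:.1f}-fold")
```

Under this assumed genome length, the two read counts give approximately 0.026× and 0.012×, and the reduction relative to 0.1× comes out slightly below four-fold.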

Highlights

  • The estimation of genetic relatives in ancient DNA research has become an integral part of any study involving individuals from the same site or region

  • Individual aDNA projects would be able to better structure their workflow and budget if kinship relationships could be estimated at early stages of the research plan—such as during ultra-low-coverage screening, which is a cheap and effective way to evaluate the quantity and quality of data that can be expected from an ancient individual

  • We present TKGWV2 (“Thomas Kent Genome-Wide Variants 2”), an update to a method published in 2017 [4] that, by using genome-wide variants instead of variant sets commonly used in aDNA research, such as the 1240K Capture or Affymetrix Human Origins arrays, increases the amount of potentially available data for the method’s relatedness estimator from 1,240,000 or 600,000 to over 22,000,000 non-fixed biallelic variants present in the 1000 Genomes Project Phase 3 [10], an 18- to 37-fold gain, respectively
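The fold gains in the last highlight can be checked directly from the variant counts it quotes. The sketch below uses only those counts; the variable names are illustrative, and because the genome-wide set is described as "over 22,000,000" variants, the ratios are lower bounds.

```python
# Variant counts as quoted in the highlight.
GENOME_WIDE_VARIANTS = 22_000_000  # lower bound ("over 22,000,000")
PANEL_SIZES = {
    "1240K Capture": 1_240_000,
    "Affymetrix Human Origins": 600_000,
}

# Ratio of genome-wide variants to each commonly used panel.
fold_gain = {
    panel: GENOME_WIDE_VARIANTS / size for panel, size in PANEL_SIZES.items()
}

for panel, gain in fold_gain.items():
    print(f"{panel}: about {gain:.0f}-fold more candidate variants")
```

The ratios round to 18 for the 1240K Capture panel and 37 for the Human Origins array, matching the range given in the highlight.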


