Ancient DNA provides a unique framework for directly studying human population genetics in time and space. However, since most ancient genomic data is low coverage, analysis is confronted with a low number of SNPs, genotype uncertainty, and reference bias. Here, we benchmark, for the first time, the two distinct versions of Glimpse on 120 ancient human genomes from Eurasia, many of them from previously under-evaluated regions, and compare the performance of genotype imputation with the de facto approaches for analysing low-coverage genomic data. We further investigate the impact of two distinct reference panels on imputation accuracy for low-coverage genomic data. We compute accuracy statistics and perform PCA and f4-statistics to explore how genotype imputation of low-coverage data behaves with respect to (i) two versions of Glimpse, (ii) two reference panels, (iii) four post-imputation filters and coverages, and (iv) the data type and geographical origin of the samples. Our results show that genotype imputation using Glimpse v2 is suitable even for ancient human genomes at 0.1X coverage. Additionally, using the 1000 Genomes panel merged with the Human Genome Diversity Panel improves imputation accuracy for rare variants (those with low minor allele frequency, MAF), which might be important not only for ancient genomics but also for modern human genomic studies based on low-coverage data and for haplotype-based analysis. Most importantly, we show that genotype imputation of low-coverage ancient human genomes reduces the genetic affinity of the samples towards the human reference genome. By solving one of the most challenging biases in data analysis, the so-called reference bias, genotype imputation using Glimpse v2 is promising for low-coverage ancient human genomic data analysis and for rare-variant-based and haplotype-based analysis.