Biologists currently have an assortment of high-throughput sequencing techniques that allow population dynamics to be studied in increasing detail. The utility of genetic estimates depends on their ability to recover meaningful approximations while filtering out noise produced by artifacts. In this study, we empirically compared the congruence of two reduced representation approaches (genotyping-by-sequencing, GBS, and whole-exome sequencing, WES) in estimating genetic diversity and population structure using SNP markers typed in a small number of wild jaguar (Panthera onca) samples from South America. Due to its targeted nature, WES allowed for a more straightforward reconstruction of loci than GBS, facilitating the identification of true polymorphisms across individuals. We therefore used WES-derived metrics as a benchmark against which GBS-derived indicators were compared, adjusting parameters for locus assembly and SNP filtering in the latter. We observed significant variation in SNP call rates across samples in GBS datasets, leading to recurrent miscalling of heterozygous sites. This issue was further amplified by small sample sizes, ultimately impacting the consistency of summary statistics between genotyping methods. While we recognize that the genetic markers obtained from GBS and WES are intrinsically different due to varying evolutionary pressures, particularly selection, we consider that our empirical comparison offers valuable insights and highlights critical considerations for estimating population genetic attributes using reduced representation datasets. Our results emphasize the need for careful evaluation of missing data and stringent filtering to achieve reliable estimates of genetic diversity and differentiation in elusive wildlife species.