Abstract

Studies in a variety of species have shown evidence for positively selected variants introduced into a population via introgression from another, distantly related population, a process known as adaptive introgression. However, there are few explicit frameworks for jointly modelling introgression and positive selection in order to detect these variants using genomic sequence data. Here, we develop an approach based on convolutional neural networks (CNNs). CNNs do not require the specification of an analytical model of allele frequency dynamics and have outperformed alternative methods for classification and parameter estimation tasks in various areas of population genetics. Thus, they are potentially well suited to the identification of adaptive introgression. Using simulations, we trained CNNs on genotype matrices derived from genomes sampled from the donor population, the recipient population and a related non-introgressed population, in order to distinguish regions of the genome evolving under adaptive introgression from those evolving neutrally or experiencing selective sweeps. Our CNN architecture exhibits 95% accuracy on simulated data, even when the genomes are unphased, and accuracy decreases only moderately in the presence of heterosis. As a proof of concept, we applied our trained CNNs to human genomic datasets, both phased and unphased, to detect candidates for adaptive introgression that shaped our evolutionary history.

Highlights

  • Ancient DNA studies have shown that human evolution during the Pleistocene was characterised by numerous episodes of interbreeding between distantly related groups (Green et al, 2010; Reich et al, 2010; Meyer et al, 2012; Prüfer et al, 2017; Kuhlwilm et al, 2016)

  • We designed a convolutional neural network (CNN) (Figure 1) that takes a concatenated genotype matrix as input to distinguish adaptive introgression scenarios from other neutral or selection scenarios

  • The CNN outputs the probability that the input matrix comes from a genomic region that underwent adaptive introgression
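
The concatenated input described above can be illustrated with a minimal sketch. It assumes that each population is given as a sites-by-haplotypes matrix of 0/1 alleles, that sites are summed into equal-width bins so every matrix has the same number of rows, and that the populations are then joined side by side. The function names, the binning scheme and the column order are hypothetical choices made for illustration and are not taken from the authors' implementation.

```python
from typing import List

Matrix = List[List[int]]  # rows = sites or bins, columns = haplotypes


def bin_allele_counts(genotypes: Matrix, positions: List[int],
                      region_length: int, n_bins: int) -> Matrix:
    """Sum 0/1 alleles per haplotype within equal-width bins along the region."""
    n_haplotypes = len(genotypes[0]) if genotypes else 0
    binned = [[0] * n_haplotypes for _ in range(n_bins)]
    for row, pos in zip(genotypes, positions):
        # Map the site position to a bin; clamp so pos == region_length stays in range.
        b = min(pos * n_bins // region_length, n_bins - 1)
        for h, allele in enumerate(row):
            binned[b][h] += allele
    return binned


def concatenate_populations(matrices: List[Matrix]) -> Matrix:
    """Join per-population matrices side by side; all must share the row count."""
    n_rows = len(matrices[0])
    if any(len(m) != n_rows for m in matrices):
        raise ValueError("all matrices need the same number of rows")
    return [sum((m[i] for m in matrices), []) for i in range(n_rows)]


# Toy data (made up for illustration): four sites in a 100 bp region.
positions = [10, 20, 60, 90]
donor = [[1, 0], [1, 1], [0, 1], [0, 0]]               # 2 haplotypes
recipient = [[0, 0, 1], [1, 0, 0], [1, 1, 1], [0, 1, 0]]  # 3 haplotypes
unadmixed = [[0, 0], [0, 0], [1, 0], [0, 0]]           # 2 haplotypes

cnn_input = concatenate_populations([
    bin_allele_counts(pop, positions, region_length=100, n_bins=4)
    for pop in (donor, recipient, unadmixed)
])
# cnn_input has 4 rows (bins) and 7 columns (2 + 3 + 2 haplotypes).
```

A matrix of this shape would then be passed to the CNN, whose final layer yields the probability of adaptive introgression; the network itself is not sketched here.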


Summary

Introduction

Ancient DNA studies have shown that human evolution during the Pleistocene was characterised by numerous episodes of interbreeding between distantly related groups (Green et al, 2010; Reich et al, 2010; Meyer et al, 2012; Prüfer et al, 2017; Kuhlwilm et al, 2016). In the past few years, several methods have been developed to identify regions of present-day or ancient human genomes containing haplotypes that were introgressed from other groups of hominins. These include methods based on probabilistic models (Sankararaman et al, 2014, 2016; Steinrücken et al, 2018; Racimo et al, 2017a), on summary statistics (Vernot and Akey, 2014; Vernot et al, 2016; Racimo et al, 2017b; Durvasula and Sankararaman, 2019) and on ancestral recombination graph reconstructions (Kuhlwilm et al, 2016; Hubisz et al, 2020; Speidel et al, 2019). While recent evidence suggests that a large proportion of Neanderthal ancestry was likely negatively selected (Harris and Nielsen, 2016; Juric et al, 2016), there is support for positive selection on a smaller proportion of the genome, a phenomenon known as adaptive introgression (AI) (Whitney et al, 2006; Hawks and Cochran, 2006; Racimo et al, 2015).
