Abstract

Current genome-wide association studies (GWAS) rely on genotype imputation to increase statistical power, improve fine-mapping of association signals, and facilitate meta-analyses. Because of the complex demographic history of Latin America and the under-representation of Native American genomes in current imputation panels, locally relevant disease variants are likely to be missed, limiting the scope and impact of biomedical research in these populations. Better representation of diversity in genomic databases is therefore a scientific imperative. Here, we expand the 1000 Genomes Project reference panel (1KGP) with 134 Native American genomes (1KGP + NAT) to assess imputation performance in Latin American individuals of mixed ancestry. The expanded panel increased the number of SNPs above the GWAS quality threshold, thus improving statistical power for association studies in the region. It also increased imputation accuracy, particularly for low-frequency variants segregating in Native American ancestry tracts. The improvement is subtle but consistent across countries and proportional to the number of genomes added from local source populations. To project the potential improvement with a larger number of reference genomes, we performed simulations and found that at least 3,000 Native American genomes are needed to match the imputation performance achieved for variants in European ancestry tracts. This reflects the concerning imbalance of diversity in current reference panels and highlights the contribution of our work to reducing it while complementing efforts to improve global equity in genomic research.
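To make the two quantities mentioned in the abstract concrete, the sketch below shows one common way to score imputation accuracy and to count SNPs that pass a quality cutoff. It is an illustration under stated assumptions, not necessarily the procedure used in the study: accuracy is taken to be the squared Pearson correlation between masked true genotypes and imputed dosages, the function names `dosage_r2` and `count_passing` are hypothetical, and the cutoff is a required argument because the abstract does not state its value.

```python
def dosage_r2(true_genotypes, imputed_dosages):
    """Squared Pearson correlation between true genotypes (0, 1, 2)
    and imputed dosages (0.0 to 2.0) for one SNP across individuals.

    Assumption: this is one common accuracy measure; the abstract does
    not specify which metric the study used.
    """
    n = len(true_genotypes)
    if n != len(imputed_dosages) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 samples")

    mean_t = sum(true_genotypes) / n
    mean_d = sum(imputed_dosages) / n
    cov = sum((t - mean_t) * (d - mean_d)
              for t, d in zip(true_genotypes, imputed_dosages))
    var_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    var_d = sum((d - mean_d) ** 2 for d in imputed_dosages)

    # A monomorphic SNP has zero variance, so the correlation is undefined;
    # this sketch reports 0.0 in that case (a choice, not a standard).
    if var_t == 0 or var_d == 0:
        return 0.0
    return (cov * cov) / (var_t * var_d)


def count_passing(r2_by_snp, threshold):
    """Count SNPs whose accuracy score is at or above a caller-supplied cutoff.

    The cutoff is deliberately not given a default: the source text refers
    to a "GWAS quality threshold" without stating its value.
    """
    return sum(1 for r2 in r2_by_snp.values() if r2 >= threshold)
```

Comparing `count_passing` for the same SNP set imputed with two different reference panels, at the same cutoff, gives the kind of panel-versus-panel comparison the abstract describes. Restricting the individuals or sites to a given ancestry tract before scoring would be a further step that this sketch does not attempt.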

Highlights

  • Over the past years, GWAS have identified thousands of genetic associations with multiple phenotypes (MacArthur et al, 2017; Visscher et al, 2017) and targets for potential new drugs (Agrawal and Brown 2014; Flannick et al, 2014; Nelson et al, 2015), and have facilitated disease stratification (Chatterjee, Shi, and García-Closas 2016)

  • Our results show a promising trend of improvement in imputation performance after adding 134 Native American genomes to the most widely used reference panel of global variation

  • GWAS require large sample sizes to detect genetic associations with complex phenotypes, and more so as the field moves toward studying rare variants (Collins 2012; Amendola et al, 2018; Abul-Husn and Kenny 2019)


Introduction

GWAS have identified thousands of genetic associations with multiple phenotypes (MacArthur et al, 2017; Visscher et al, 2017) and targets for potential new drugs (Agrawal and Brown 2014; Flannick et al, 2014; Nelson et al, 2015), and have facilitated disease stratification (Chatterjee, Shi, and García-Closas 2016). However, the findings of large-scale GWAS performed in populations of European descent have limited portability to other ancestry groups (Duncan et al, 2019; Sirugo, Williams, and Tishkoff 2019) due to population substructure. This is a major limitation for Latin American populations, which are the result of recent admixture primarily between Native American, European, and African populations, and in which only 1.3% of both discovery and replication studies have been performed (Mills and Rahal 2019). If the current bias in catalogs of human variation persists, many population-specific variants will be overlooked, and precision medicine strategies will not benefit all populations (Martin et al, 2019).
