Abstract

Background: Accurate inference of genetic ancestry is of fundamental interest to many biomedical, forensic, and anthropological research areas. Genetic ancestry memberships may relate to genetic disease risks. In a genome association study, failing to account for differences in genetic ancestry between cases and controls may also lead to false-positive results. Although a number of strategies for inferring and taking into account the confounding effects of genetic ancestry are available, applying them to large studies (tens of thousands of samples) is challenging. The goal of this study is to develop an approach for inferring the genetic ancestry of samples with unknown ancestry among closely related populations and to provide accurate estimates of ancestry for application to large-scale studies.

Methods: In this study we developed a novel distance-based approach, Ancestry Inference using Principal component analysis and Spatial analysis (AIPS), that incorporates an Inverse Distance Weighted (IDW) interpolation method from spatial analysis to assign individuals to population memberships.

Results: We demonstrate the benefits of AIPS in analyzing population substructure, specifically relative to the four most commonly used tools (EIGENSTRAT, STRUCTURE, fastSTRUCTURE, and ADMIXTURE), using genotype data from various intra-European panels and European-Americans. While these commonly used tools performed poorly in inferring ancestry from a large number of subpopulations, AIPS accurately distinguished variations between and within subpopulations.

Conclusions: Our results show that AIPS can be applied to large-scale data sets to discriminate the modest variability among intra-continental populations as well as to characterize inter-continental variation. The method we developed will protect against spurious associations when mapping the genetic basis of a disease. Our approach is a more accurate and computationally efficient method for inferring genetic ancestry in large-scale genetic studies.
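
To make the IDW idea concrete, the following is a minimal sketch of generic inverse distance weighting applied to population membership. It is not the published AIPS implementation: the function name `idw_membership`, the `power` parameter, and the use of Euclidean distance in principal component space are illustrative assumptions. Each reference individual with known ancestry contributes a weight of 1/d^power, where d is its distance to the query sample, and the weights are summed per population and normalized.

```python
import math

def idw_membership(query, ref_coords, ref_labels, power=2.0):
    """Generic IDW membership scores for one sample (hypothetical helper).

    query      -- coordinates of the unknown sample, e.g. its top PC scores
    ref_coords -- coordinates of reference samples with known ancestry
    ref_labels -- population label of each reference sample
    power      -- distance exponent (assumed value; larger favors near neighbors)
    """
    weights = {}
    for coords, label in zip(ref_coords, ref_labels):
        d = math.dist(query, coords)  # Euclidean distance in PC space
        if d == 0.0:
            # The query coincides with a reference sample: full membership
            # goes to that sample's population.
            return {lab: (1.0 if lab == label else 0.0) for lab in set(ref_labels)}
        weights[label] = weights.get(label, 0.0) + 1.0 / d ** power
    total = sum(weights.values())
    # Normalize so the membership scores sum to 1.
    return {label: w / total for label, w in weights.items()}

# Toy example: two reference clusters in a two-dimensional PC space.
refs = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
labels = ["A", "A", "B", "B"]
print(idw_membership((1.0, 0.0), refs, labels))
```

Because the scores are normalized weights, a sample lying between two reference clusters receives fractional membership in both, which is the property that makes a distance-based scheme attractive for closely related populations.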

Highlights

  • Accurate inference of genetic ancestry is of fundamental interest to many biomedical, forensic, and anthropological research areas

  • Genome-wide association studies typically have far more SNPs (p) than samples (n), in which case principal components analysis is performed in the Q-mode: the components can be obtained by calculating the eigenvectors and eigenvalues of a covariance matrix whose rank is at most n-1

  • Application in European subpopulations and European ancestry informative markers (AIMs): to demonstrate the application of AIPS, we performed an intra-European analysis involving 4,376 individuals of European descent with a set of 25,732 pre-selected known intra-European AIMs
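
The Q-mode computation mentioned in the highlights can be sketched as follows. This is an illustrative NumPy version under stated assumptions, not the authors' code: the function name `qmode_pca` is hypothetical, and SNPs are only mean-centered (no variance scaling). Instead of decomposing the p x p SNP covariance matrix, it decomposes the n x n sample-by-sample matrix, which shares the same nonzero eigenvalues and has rank at most n-1 after centering.

```python
import numpy as np

def qmode_pca(genotypes, n_components=2):
    """Q-mode PCA for an n x p genotype matrix with n << p (hypothetical helper)."""
    X = np.asarray(genotypes, dtype=float)
    n = X.shape[0]
    Xc = X - X.mean(axis=0)              # center each SNP (column)
    G = Xc @ Xc.T / (n - 1)              # n x n sample-by-sample matrix
    eigvals, eigvecs = np.linalg.eigh(G)  # eigh: G is symmetric
    order = np.argsort(eigvals)[::-1][:n_components]  # largest first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Scaling each eigenvector by sqrt((n-1) * eigenvalue) gives the sample
    # scores, i.e. the projections of the centered data onto the SNP-space axes.
    scores = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None) * (n - 1))
    return eigvals, scores

# Toy example: 6 samples, 50 SNPs coded as 0/1/2 allele counts.
rng = np.random.default_rng(0)
toy = rng.integers(0, 3, size=(6, 50))
eigvals, scores = qmode_pca(toy, n_components=2)
print(scores.shape)
```

Working with the n x n matrix keeps the eigen-decomposition cost tied to the number of samples rather than the number of SNPs, which is why the Q-mode is the natural choice when p greatly exceeds n.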


Summary

Introduction

Accurate inference of genetic ancestry is of fundamental interest to many biomedical, forensic, and anthropological research areas. In a genome association study, failing to account for differences in genetic ancestry between cases and controls may lead to false-positive results. The goal of this study is to develop an approach for inferring the genetic ancestry of samples with unknown ancestry among closely related populations and to provide accurate estimates of ancestry for application to large-scale studies. Genome-wide association studies (GWAS) have helped identify a large number of allelic variants for common complex traits and diseases. Population stratification, the presence of systematic allele frequency differences between populations or subpopulations, can cause spurious associations and distort estimates of the effect of genetic variants on disease [1,2,3,4,5]. Basing the analysis on AIMs, rather than on all markers that might have been analyzed in a GWAS, allows a more parsimonious use of the data, and AIMs are typically selected to avoid strong linkage disequilibrium among them.

