Abstract

Whole-genome studies of genetic variation are now performed routinely and have accelerated the identification of disease-associated allelic variants, positive selection, recombination, and structural variation. However, these studies are sensitive to the presence of outlier data from individuals of different ancestry than the rest of the sample. Currently, the most common method of excluding outlier individuals is to collect a population sample and exclude outliers after genome-wide data have been collected. Here we show that a small collection of 20-27 polymorphic Alu insertions, selected using a principal component-based method with genetic ancestry estimates, may be used to easily assign Africans, East Asians, and Europeans to their population of origin. In addition, we show that samples from a geographically and genetically intermediate population (in our study, samples from India) can be identified within the original sample of Africans, East Asians, and Europeans. Finally, we show that outlier individuals from neighboring geographic regions (in our study, Yemen and sub-Saharan Africa) can be identified. These results will be of value in preselection of samples for more in-depth analysis as well as customized identification of maximally informative polymorphic markers for regional studies.
