Abstract

Using principal component (PC) analysis, we studied the genetic constitution of 3,112 individuals from Europe as portrayed by more than 270,000 single nucleotide polymorphisms (SNPs) genotyped with the Illumina Infinium platform. In cohorts where the sample size was >100, one hundred randomly chosen samples were used for analysis to minimize the sample size effect, resulting in a total of 1,564 samples. This analysis revealed that the genetic structure of the European population correlates closely with geography. The first two PCs highlight the genetic diversity corresponding to the northwest-to-southeast gradient and position the populations according to their approximate geographic origin. The resulting genetic map forms a triangular structure with a) Finland, b) the Baltic region, Poland and Western Russia, and c) Italy as its vertices, and with d) Central and Western Europe in its centre. Inter- and intra-population genetic differences were quantified by the inflation factor lambda (λ) (ranging from 1.00 to 4.21), by the fixation index (Fst) (ranging from 0.000 to 0.023), and by the number of markers exhibiting significant allele frequency differences in pair-wise population comparisons. The estimated λ was used to assess the actual detrimental impact on association statistics when two distinct populations are merged directly in an analysis. When the PC analysis was confined to the 1,019 Estonian individuals (0.1% of the Estonian population), a fine structure emerged that correlated with the geography of individual counties. With at least two cohorts available from several countries, genetic substructures were investigated in Czech, Finnish, German, Estonian and Italian populations. Together with previously published data, our results allow the creation of a comprehensive European genetic map that will greatly facilitate inter-population genetic studies, including genome-wide association studies (GWAS).
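To make the two central quantities of the abstract concrete, the following is a minimal sketch, not the authors' pipeline. It runs a PC analysis on a matrix of 0/1/2 allele counts and estimates an inflation factor λ between two groups as the median allelic chi-square statistic divided by the median of a chi-square distribution with one degree of freedom. The function names `genotype_pca` and `inflation_factor`, the per-SNP standardization, the allelic 2x2 test, and the simulated allele frequencies are all assumptions made for illustration; the source does not state which software or normalization was used.

```python
import numpy as np

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genotype_pca(genotypes, n_components=2):
    """PC scores from an (individuals x SNPs) matrix of 0/1/2 allele counts.

    Assumption: each SNP is centred by 2p and scaled by sqrt(2p(1-p)),
    where p is the sample allele frequency. Monomorphic SNPs are dropped.
    """
    g = np.asarray(genotypes, dtype=float)
    p = g.mean(axis=0) / 2.0
    keep = (p > 0) & (p < 1)
    g, p = g[:, keep], p[keep]
    z = (g - 2.0 * p) / np.sqrt(2.0 * p * (1.0 - p))
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


def inflation_factor(geno_a, geno_b):
    """Lambda from per-SNP allelic chi-square tests (1 df) between two groups."""
    geno_a, geno_b = np.asarray(geno_a), np.asarray(geno_b)
    na, nb = 2.0 * geno_a.shape[0], 2.0 * geno_b.shape[0]  # alleles per group
    a1 = geno_a.sum(axis=0).astype(float)  # counted allele, group A
    b1 = geno_b.sum(axis=0).astype(float)  # counted allele, group B
    keep = (a1 + b1 > 0) & (a1 + b1 < na + nb)  # polymorphic in pooled sample
    a1, b1 = a1[keep], b1[keep]
    a0, b0 = na - a1, nb - b1
    chi2 = (na + nb) * (a1 * b0 - a0 * b1) ** 2 / (
        na * nb * (a1 + b1) * (a0 + b0)
    )
    return np.median(chi2) / CHI2_1DF_MEDIAN


# Hypothetical data: two populations whose allele frequencies have drifted
# apart from a shared ancestral frequency (all values are illustrative).
rng = np.random.default_rng(0)
n_per_pop, n_snps = 50, 500
p_anc = rng.uniform(0.1, 0.9, n_snps)
p_a = np.clip(p_anc + rng.normal(0, 0.1, n_snps), 0.05, 0.95)
p_b = np.clip(p_anc + rng.normal(0, 0.1, n_snps), 0.05, 0.95)
geno_a = rng.binomial(2, p_a, size=(n_per_pop, n_snps))
geno_b = rng.binomial(2, p_b, size=(n_per_pop, n_snps))

pcs = genotype_pca(np.vstack([geno_a, geno_b]))
lam_between = inflation_factor(geno_a, geno_b)
lam_within = inflation_factor(geno_a[:25], geno_a[25:])
```

In this toy setting the first PC separates the two simulated populations, and λ between them is well above the value obtained by splitting a single population in half. This mirrors, in miniature, why merging genetically distinct cohorts can inflate association statistics.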

Highlights

  • Over the last few years, the number of genome-wide association studies (GWAS) has increased markedly and, in concert, these efforts have led to the identification of a large number of new susceptibility loci for common multi-factorial disorders [1].

  • In the present study using autosomal SNPs and high density genotyping, we have focused on the genetic structure of the Baltic, Finnish and other North-Eastern European populations, while populations from Western and Southern Europe were included mainly for comparison (Figure 2)

  • Previous studies have focused upon the genetic structure in Central and Western Europe [11,12,13], Northern Europe [17,25] or studied US Americans of European and Ashkenazi-Jewish descent [14,15]

Introduction

Over the last few years, the number of genome-wide association studies (GWAS) has increased markedly and, in concert, these efforts have led to the identification of a large number of new susceptibility loci for common multi-factorial disorders [1]. The underlying technology is developing rapidly and is currently moving from the use of high density SNP arrays towards medical re-sequencing of large genomic regions. Given this development, the availability of thoroughly phenotyped patient and control samples is becoming even more important. With the vast amount of genome-wide data available, the actual extent and relevance of population genetic differences can be clarified with high confidence for most commonly used SNP sets.

