Abstract
BackgroundGenotype networks are representations of genetic variation data that are complementary to phylogenetic trees. A genotype network is a graph whose nodes are genotypes (DNA sequences) with the same broadly defined phenotype. Two nodes are connected if they differ in some minimal way, e.g., in a single nucleotide.ResultsWe analyze human genome variation data from the 1,000 genomes project, and construct haploid genotype (haplotype) networks for 12,235 protein coding genes. The structure of these networks varies widely among genes, indicating different patterns of variation despite a shared evolutionary history. We focus on those genes whose genotype networks show many cycles, which can indicate homoplasy, i.e., parallel or convergent evolution, on the sequence level.ConclusionFor 42 genes, the observed number of cycles is so large that it cannot be explained by either chance homoplasy or recombination. When analyzing possible explanations, we discovered evidence for positive selection in 21 of these genes and, in addition, a potential role for constrained variation and purifying selection. Balancing selection plays at most a small role. The 42 genes with excess cycles are enriched in functions related to immunity and response to pathogens. Genotype networks are representations of genetic variation data that can help understand unusual patterns of genomic variation.Electronic supplementary materialThe online version of this article (doi:10.1186/s12862-016-0722-0) contains supplementary material, which is available to authorized users.
Highlights
Genotype networks are representations of genetic variation data that are complementary to phylogenetic trees
Using a test based on the hypergeometric distribution [60], we found no significant overlap between the genes that showed evidence of positive selection in the XPCLR test and those genes among our 42 focal genes that (i) have significantly fewer synonymous mutations inside the squares than outside the squares of their haplotype network (2 common genes) or (ii) had been identified in several previous studies as being subject to positive selection (3 common genes)
We show that the haplotype networks of 42 genes display a significant excess of squares that cannot be explained by chance homoplasy, genetic recombination, or balancing selection alone
Summary
Genotype networks are representations of genetic variation data that are complementary to phylogenetic trees. A genotype network is a graph whose nodes are genotypes (DNA sequences) with the same broadly defined phenotype. The patterns and causes of genotypic variation in human genes have been a focus of great recent interest in evolutionary biology. Different processes such as natural selection, genetic recombination, genetic drift, demography, as well as physicochemical properties of cells, can influence this diversity. We use a novel approach based on genotype networks to represent and analyze genetic variation in human genes. Genotype networks are graphs that consist of nodes, which correspond to genotypes with the same phenotype, where sameness can be defined as.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.