Abstract

Background: Genetic polymorphisms can contribute to phenotypic differences amongst individuals, including disease risk and drug response. Characterization of genetic polymorphisms that modulate gene expression and/or protein function may facilitate the identification of the causal variants. Here, we present the architecture of genetic polymorphisms in the human genome, focusing on those predicted to be potentially functional or under natural selection and on the pathways in which they reside.

Results: In the human genome, polymorphisms that directly affect protein sequences and potentially affect function are the most constrained variants, with the lowest single-nucleotide variant (SNV) density, the least population differentiation and the most significant enrichment of rare alleles. SNVs that potentially alter various regulatory sites, e.g. splicing regulatory elements, are also generally under negative selection. Interestingly, genes that regulate the expression of transcription/splicing factors and histones are conserved, as a higher proportion of these genes are non-polymorphic, contain ultra-conserved elements (UCEs) and/or have no non-synonymous SNVs (nsSNVs)/coding INDELs. On the other hand, major histocompatibility complex (MHC) genes are the most polymorphic, with SNVs potentially affecting the binding of transcription/splicing factors and microRNAs (miRNAs) exhibiting recent positive selection (RPS). The drug transporter genes carry the largest number of potentially deleterious nsSNVs and exhibit signatures of RPS and/or population differentiation. These observations suggest that genes that interact with the environment are highly polymorphic and targeted by RPS.

Conclusions: Selective constraints are observed in coding regions, master regulator genes, and potentially functional SNVs. In contrast, genes that modulate response to the environment are highly polymorphic and under positive selection.

Highlights

  • Genetic polymorphisms can contribute to phenotypic differences amongst individuals, including disease risk and drug response

  • Polymorphisms are most constrained in coding regions. Of the > 14 million polymorphisms in the human genome validated in the dbSNP database (Build 131), 38% are within protein-coding genes while 62% reside in intergenic regions.

  • In conclusion, this study elucidates the overall architecture of the genetic polymorphisms, namely single-nucleotide variants (SNVs) and insertions/deletions (INDELs), in the human genome


Summary

Introduction

Genetic polymorphisms may contribute to differences in disease risk and drug response amongst individuals. These polymorphisms are likely to alter gene expression or protein function, leading to modulation of cellular function and influencing disease risk or drug response. Identifying the single causal variant, or group of causal variants, for a particular phenotype from a pool of more than 100 million polymorphisms is like ‘finding a needle in a haystack’ and remains a great challenge, since not all genetic variants are functionally important.

