Abstract
Identifying genomic regions that regulate gene expression can improve our understanding of the mechanisms underlying genetic contributions to phenotypic variation. We therefore consider a mixture model to locate candidate genomic regions that are more frequently associated with gene expression traits. A modified two-sample t-statistic was used to test for association, and single-nucleotide polymorphisms (SNPs) with P-values < 10⁻⁵ were carried forward to a two-component negative binomial mixture model. An expectation-maximisation algorithm was used to estimate the model parameters. The SNPs were then ranked by their false discovery rate (FDR) values, and any SNP with an FDR value < 1% was considered a potential hotspot. Three independent datasets were used to replicate the findings. A number of common hotspots were identified, and many of them have annotated functions as binding sites of transcription factors or histone proteins.
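To make the mixture step concrete, the following is a minimal sketch of how a two-component negative binomial mixture could be fitted by expectation-maximisation and used to flag hotspots. It is an illustration under stated assumptions, not the procedure used in the study. It assumes that each SNP is summarised by the number of expression traits it is associated with at P < 10⁻⁵, that the negative binomial size parameter `r` is held fixed so that the mean updates have a closed form, and that the FDR value of a SNP is taken to be its posterior probability of belonging to the background component (a local FDR). The names `fit_nb_mixture` and `call_hotspots` are hypothetical.

```python
import math

def nb_logpmf(k, mu, r):
    """Log-probability of count k under a negative binomial with mean mu and size r."""
    p = r / (r + mu)
    return (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
            + r * math.log(p) + k * math.log(1.0 - p))

def fit_nb_mixture(counts, r=5.0, n_iter=200):
    """EM for a two-component NB mixture (background vs. hotspot).

    Assumption: the size parameter r is fixed and shared by both components,
    so the M-step for each mean is a posterior-weighted average of the counts.
    """
    srt = sorted(counts)
    mu0 = max(srt[len(srt) // 2], 0.5)   # background mean starts at the median
    mu1 = max(srt[-1], mu0 + 1.0)        # hotspot mean starts at the maximum
    pi1 = 0.1                            # initial hotspot proportion
    post = []
    for _ in range(n_iter):
        # E-step: posterior probability that each SNP is in the hotspot component
        post = []
        for k in counts:
            a = math.log(1.0 - pi1) + nb_logpmf(k, mu0, r)
            b = math.log(pi1) + nb_logpmf(k, mu1, r)
            m = max(a, b)                # subtract the max for numerical stability
            ea, eb = math.exp(a - m), math.exp(b - m)
            post.append(eb / (ea + eb))
        # M-step: update the mixing proportion and the two component means
        s1 = sum(post)
        s0 = len(counts) - s1
        pi1 = min(max(s1 / len(counts), 1e-6), 1.0 - 1e-6)
        mu0 = max(sum((1.0 - w) * k for w, k in zip(post, counts)) / max(s0, 1e-12), 1e-6)
        mu1 = max(sum(w * k for w, k in zip(post, counts)) / max(s1, 1e-12), 1e-6)
    return pi1, mu0, mu1, post

def call_hotspots(post, threshold=0.01):
    """Rank SNP indices by local FDR (1 - posterior) and keep those below threshold."""
    fdr = [(1.0 - w, i) for i, w in enumerate(post)]
    return [i for f, i in sorted(fdr) if f < threshold]
```

Calling `fit_nb_mixture(counts)` on a list of per-SNP association counts returns the estimated hotspot proportion, the two component means, and the per-SNP posterior probabilities; passing the posteriors to `call_hotspots` gives the indices of SNPs whose local FDR is below 1%, ordered from most to least confident. Estimating `r` as well would require a numerical optimisation inside the M-step, which is omitted here for brevity.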