Expression quantitative locus mapping for identification of hotspots using an empirical Bayes mixture model

Guanglong Jiang, Yingqiang Fu, Shirin Ardeshir Rouhani Fard, Zhigao Li, Pengyue Zhang, Lang Li, Lijun Cheng

doi:10.1504/ijcbdd.2017.083882

Abstract

Identifying genomic regions that regulate gene expression can improve our understanding of the mechanisms underlying genetic contributions to phenotypic variation. Hence, we consider a mixture model to locate candidate genomic regions that are more frequently associated with gene expression traits. A modified two-sample t-statistic was used, and single-nucleotide polymorphisms (SNPs) with P-values < 10^-5 were carried forward to a two-component negative binomial mixture model. An expectation-maximisation algorithm was adopted to estimate the parameters of the model. The SNPs were then ranked by their false discovery rate (FDR) values, and any SNP with an FDR value < 1% was considered a potential hotspot. Three independent datasets were used to replicate the findings. A number of common hotspots were identified, and many hotspots have annotated functions as binding sites of transcription factors or histone proteins.
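The mixture-model step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions the abstract does not confirm: each SNP is summarised by a count of associated expression traits; the negative binomial is parameterised by a size `r` and a mean `mu`; the M-step uses the weighted mean for `mu` and a coarse grid search for `r` (an approximation to a full maximisation); and the FDR of a ranked list is taken as the running average of posterior null probabilities. All function names and the simulated data are hypothetical.

```python
import math
import random

def nb_logpmf(k, r, mu):
    """Log-pmf of a negative binomial with size r and mean mu."""
    p = r / (r + mu)  # success probability implied by (r, mu)
    return (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
            + r * math.log(p) + k * math.log1p(-p))

def weighted_nb_fit(counts, weights):
    """Approximate weighted MLE: mean in closed form, size r by grid search."""
    wsum = max(sum(weights), 1e-12)
    mu = max(sum(w * k for w, k in zip(weights, counts)) / wsum, 1e-8)
    grid = [10 ** (e / 5.0) for e in range(-10, 16)]  # r from 0.01 to 1000
    r = max(grid, key=lambda g: sum(w * nb_logpmf(k, g, mu)
                                    for w, k in zip(weights, counts)))
    return r, mu

def posterior_hotspot(counts, pi1, params):
    """E-step: posterior probability that each SNP is in the hotspot component."""
    post = []
    for k in counts:
        l0 = math.log(1.0 - pi1) + nb_logpmf(k, *params[0])
        l1 = math.log(pi1) + nb_logpmf(k, *params[1])
        m = max(l0, l1)  # subtract the max for numerical stability
        e0, e1 = math.exp(l0 - m), math.exp(l1 - m)
        post.append(e1 / (e0 + e1))
    return post

def fit_mixture(counts, n_iter=20):
    """EM-style fit; component 0 is 'null', component 1 is 'hotspot'."""
    cut = sorted(counts)[int(0.9 * len(counts))]  # assumed starting split
    pi1 = 0.1
    params = [weighted_nb_fit(counts, [1.0 if k <= cut else 1e-3 for k in counts]),
              weighted_nb_fit(counts, [1.0 if k > cut else 1e-3 for k in counts])]
    for _ in range(n_iter):
        post = posterior_hotspot(counts, pi1, params)
        pi1 = min(max(sum(post) / len(post), 1e-6), 1.0 - 1e-6)
        params = [weighted_nb_fit(counts, [1.0 - z for z in post]),
                  weighted_nb_fit(counts, post)]
    return pi1, params, posterior_hotspot(counts, pi1, params)

def tail_fdr(post):
    """FDR per SNP: running mean of null posteriors down the ranked list."""
    order = sorted(range(len(post)), key=lambda i: -post[i])
    fdr, running = [0.0] * len(post), 0.0
    for rank, i in enumerate(order, start=1):
        running += 1.0 - post[i]
        fdr[i] = running / rank
    return fdr

def sample_nb(r, mu):
    """Draw from NB(r, mu) for integer r as a sum of r geometric variates."""
    p = r / (r + mu)
    return sum(int(math.log(1.0 - random.random()) / math.log(1.0 - p))
               for _ in range(r))

# Simulated counts (hypothetical): 285 background SNPs and 15 hotspot SNPs.
random.seed(0)
counts = [sample_nb(2, 2.0) for _ in range(285)] + [sample_nb(5, 40.0) for _ in range(15)]
pi1, params, post = fit_mixture(counts)
fdr = tail_fdr(post)
hotspots = [i for i, f in enumerate(fdr) if f < 0.01]  # FDR < 1% threshold
print("hotspot weight:", round(pi1, 3), "flagged SNPs:", len(hotspots))
```

The grid search keeps the sketch free of external dependencies. A numerical optimiser over `r` would give a sharper M-step at the cost of extra code.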

Full Text