Post-translational modifications of histones play a key role in controlling gene expression: they regulate transcription by controlling DNA accessibility or by recruiting specific transcription factors. Developing computational methods to predict gene expression levels from histone modification data may facilitate understanding of their function in gene regulation and contribute to the development of ‘epigenetic drugs’ for disease treatment. Several studies have reported that transcription termination plays a crucial role in gene regulation, as termination can occur prematurely and can determine the cellular fate of transcripts. However, previous work has mainly focused on the histone characteristics flanking the transcription start site (TSS) of genes, while ignoring the histone modification features around the transcription termination site (TTS). We therefore introduce a hybrid convolutional and bi-directional long short-term memory network with an attention mechanism to predict gene expression from histone modification signals in both regions. Our model outperformed the state-of-the-art model in both the classification and regression tasks of inferring gene expression levels. The predictive results demonstrated that histone signals around the TTS provide additional information that improves model performance. In addition, attention weights and transcription factor binding data indicated that histone modifications around the TSS might regulate transcription by recruiting transcription factors (TFs), whereas their regulatory mechanisms around the TTS need further exploration.