Interpretable prediction of mRNA abundance from promoter sequence using contextual regression models.

Song Wang, Wei Wang

doi:10.1093/nargab/lqae055

Abstract

While machine learning models have been successfully applied to predicting gene expression from promoter sequences, it remains a great challenge to derive an intuitive interpretation of such models and to reveal DNA motif grammar, such as motif cooperation and distance constraints between motif sites. Previous interpretation approaches are often time-consuming or have difficulty learning the combinatorial rules. In this work, we designed interpretable neural network models to predict mRNA expression levels from DNA sequences. By applying the Contextual Regression framework we developed, we extracted weighted features to cluster samples into groups with different gene expression levels. We performed motif analysis in each cluster and found motifs with activating or repressive effects on gene expression. By comparing the co-occurrence locations of the discovered motifs, we also uncovered multiple grammars of motif combination, including communities of cooperative motifs and distance constraints between motif pairs. These results reveal new insights into the regulatory architecture of promoter sequences.
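
The "weighted features" mentioned in the abstract can be illustrated with a small sketch. It assumes the general contextual-regression form in which a network produces a per-input weight vector w(x) and the prediction is the sum of w_i(x) * x_i, so each product is one feature's contribution. The class name, layer sizes, random untrained weights, and example sequences below are all hypothetical and are not taken from the paper.

```python
import numpy as np

def one_hot(seq):
    """Encode a DNA string as a flat one-hot vector (4 values per base, order ACGT)."""
    alphabet = "ACGT"
    x = np.zeros((len(seq), 4))
    for i, base in enumerate(seq):
        x[i, alphabet.index(base)] = 1.0
    return x.ravel()

class ToyContextualRegression:
    """Hypothetical toy model: y = sum_i w_i(x) * x_i, with w(x) from a small network."""

    def __init__(self, n_features, n_hidden=8, seed=0):
        # Random, untrained parameters; a real model would be fit to expression data.
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(scale=0.1, size=(n_hidden, n_features))
        self.W2 = rng.normal(scale=0.1, size=(n_features, n_hidden))

    def context_weights(self, x):
        # The weights depend on the whole input, which is what makes them "contextual".
        return self.W2 @ np.tanh(self.W1 @ x)

    def weighted_features(self, x):
        # Per-feature contribution w_i(x) * x_i; these vectors are what one would cluster.
        return self.context_weights(x) * x

    def predict(self, x):
        # The prediction is the sum of the per-feature contributions.
        return float(self.weighted_features(x).sum())

# Made-up promoter-like sequences of equal length (illustration only).
seqs = ["ACGTACGT", "TTTTACGA", "GGGCCCAT"]
X = np.stack([one_hot(s) for s in seqs])

model = ToyContextualRegression(n_features=X.shape[1])
contributions = np.stack([model.weighted_features(x) for x in X])
predictions = np.array([model.predict(x) for x in X])
```

Each row of `contributions` sums to the corresponding prediction, so the rows can be read directly as an explanation of that prediction. Grouping samples then amounts to running any standard clustering method on these rows.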

Journal: NAR Genomics and Bioinformatics
Publication Date: Apr 4, 2024
License type: CC BY-NC 4.0

