Abstract

Identification of genes that respond to an extracellular cue enables characterization of pathophysiologically crucial biological processes. Deep sequencing technologies provide a powerful means to identify responsive genes, which creates a need for computational methods able to analyze dynamic and multi-level deep sequencing data. To answer this need, we introduce a data-driven algorithm, SPINLONG, which is designed to search for genes that match user-defined hypotheses or models. SPINLONG is applicable to various experimental setups measuring several molecular markers in parallel. To demonstrate the SPINLONG approach, we analyzed ChIP-seq data reporting PolII, estrogen receptor (ER), H3K4me3 and H2A.Z occupancy at five time points in the MCF-7 breast cancer cell line after estradiol stimulus. We obtained 777 early responsive genes and compared the biological functions of the genes having ER binding within 20 kb of the transcription start site (TSS) to those of genes without such a binding site. Our results show that the non-genomic action of ER via the MAPK pathway, instead of direct ER binding, may be responsible for early cell responses to ER activation. Our results also indicate that the responsive genes triggered by the genomic pathway are transcribed faster than those without ER binding sites. Survival analysis of the 777 responsive genes in 150 primary breast cancer tumors and in two independent validation cohorts indicated that the ATAD3B gene, which does not have an ER binding site within 20 kb of its TSS, is significantly associated with poor patient survival.

Highlights

  • The identification of genes whose expression patterns are altered due to a stimulus is essential as it provides a basis to understand which signaling and metabolic pathways are influenced as a consequence of a stimulus

  • Extracellular cues trigger signal cascades that lead to altered expression of a number of genes in the cell nucleus; a key challenge in biomedicine is to identify which genes respond to a specific stimulus

  • These so-called response genes can be investigated on a whole-genome scale with genomic sequencing, a technology that can quantify protein binding to DNA or gene activation

Introduction

The identification of genes whose expression patterns are altered by a stimulus is essential, as it provides a basis for understanding which signaling and metabolic pathways the stimulus influences. The majority of approaches to identifying stimulus-regulated changes in gene expression rely on the relative abundance of mRNA molecules, measured with either microarrays or RNA-seq, as an indirect indication of transcriptional initiation [1,2,3]. A reliable indication of an actively transcribed gene is the presence of the RNA polymerase II (PolII) protein complex in the body of the gene. PolII generates the precursors of most mRNA, snRNA and miRNA molecules, and its activity is modulated by histone modifications [4]. We hypothesized that considering PolII together with histone modifications could provide a reliable indication of changes in the rate of transcriptional activity at responding loci.
