Abstract

Identifying diffuse signals in chromatin immunoprecipitation followed by high-throughput massively parallel sequencing (ChIP-Seq) data poses significant computational challenges, and few methods are currently available. We present a novel global clustering approach to enrich diffuse ChIP-Seq signals of RNA polymerase II and histone H3 lysine 4 trimethylation (H3K4Me3) and apply it to identify putative long intergenic non-coding RNAs (lincRNAs) in macrophage cells. Our global clustering method compares favorably to SICER, a local clustering method that was also designed to identify diffuse ChIP-Seq signals. The validity of the algorithm is confirmed at several levels. First, 8 of the 11 selected putative lincRNA regions in primary macrophages respond to lipopolysaccharide (LPS) treatment as predicted by our computational method. Second, the genes nearest to the lincRNAs are enriched for biological functions related to metabolic processes under resting conditions, but for developmental and immune-related functions under LPS treatment. Third, the putative lincRNAs have conserved promoters, modestly conserved exons, and the expected predicted secondary structures. Last, they are enriched with motifs of transcription factors such as PU.1 and AP.1, previously shown to be important lineage-determining factors in macrophages, and 83% of them overlap with distal enhancer markers. In summary, the GCLS method, based on RNA polymerase II and H3K4Me3 ChIP-Seq, can effectively detect putative lincRNAs that exhibit the expected characteristics, as exemplified by macrophages in this study.

Highlights

  • Unlike messenger RNA, non-coding RNAs are a class of RNAs that are not intermediates between DNA and protein products

  • One can rationalize that the Type 1 peaks reflect the dynamics of "diffusion-reaction", an emergent picture of a limited amount of enzymes competing for specific genomic loci, assuming that the enzymes diffuse to the targeted loci and that the total loading capacity of the enzymes within a particular actively transcribed locus is relatively constant

  • Chromatin immunoprecipitation (ChIP)-Seq with signatures such as histone modifications and protein binding distributions has emerged as a new trend to predict critical genomic features

Introduction

Unlike messenger RNA, non-coding RNAs (ncRNAs) are a class of RNAs that are not intermediates between DNA and protein products. Rather than being regarded as "transcriptional noise", these ncRNAs are increasingly recognized and appreciated for their functional importance in health and disease, such as cancer [1]. According to transcript length, ncRNAs can be classified into three categories: small RNA (≤25 bp), medium-length RNA (~30–200 bp), and long RNA (longer than 200 bp) [2]. It was previously thought that ncRNAs lacked evolutionary conservation; however, recent studies have revealed compelling evidence supporting the conservation of lincRNAs [3,4]. There is also emerging evidence that lincRNAs play roles in the regulation of gene expression, in part by targeting transcriptional complexes to specific genomic locations [5,6].
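The length-based classification above can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function name `classify_ncrna` is hypothetical, the small-RNA cutoff is read as "at most 25 bp", the medium-length range is read as roughly 30 to 200 bp, and lengths of 26 to 29 bp, which the stated ranges do not cover, are reported as unclassified rather than forced into a category.

```python
def classify_ncrna(length_bp: int) -> str:
    """Assign an ncRNA length class from transcript length in base pairs.

    Thresholds are assumptions taken from the ranges quoted in the text:
    small (<= 25 bp), medium-length (about 30-200 bp), long (> 200 bp).
    """
    if length_bp <= 0:
        raise ValueError("transcript length must be a positive number of bp")
    if length_bp <= 25:
        return "small"
    if 30 <= length_bp <= 200:
        return "medium-length"
    if length_bp > 200:
        return "long"
    # 26-29 bp falls between the quoted ranges, so no class is assigned.
    return "unclassified"


if __name__ == "__main__":
    for length in (22, 28, 150, 1200):
        print(length, classify_ncrna(length))
```

Because the medium-length range is approximate in the text, the boundaries at 30 and 200 bp should be treated as soft; lincRNAs fall into the "long" class.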
