LOcating Non-Unique matched Tags (LONUT) to Improve the Detection of the Enriched Regions for ChIP-seq Data

Rui Wang, Hang-Kai Hsu, Yao Wang, Xun Lan, Yu-Wei Leu, Yisong Wang, Tim H.-M Huang, Pei-Yin Hsu, Victor X Jin, Peggy J Farnham, Adam Blattler

doi:10.1371/journal.pone.0067788

Abstract

A major limitation of computational tools for analyzing ChIP-seq data is that most of them ignore non-unique tags (NUTs) that match the human genome, even though NUTs comprise up to 60% of all raw tags in ChIP-seq data. Effectively utilizing these NUTs would increase the sequencing depth and allow more accurate detection of enriched binding sites, which in turn could lead to more precise and significant biological interpretations. In this study, we have developed a computational tool, LOcating Non-Unique matched Tags (LONUT), to improve the detection of enriched regions from ChIP-seq data. The LONUT algorithm applies a linear and polynomial regression model to establish an empirical score (ES) formula that considers two influential factors: the distance of NUTs to peaks identified using uniquely matched tags (UMTs), and the enrichment score of those peaks. As a result, each NUT is assigned to a unique location on the reference genome. The newly located tags from the set of NUTs are combined with the original UMTs to produce a final set of combined matched tags (CMTs). LONUT was tested on many different datasets representing three different characteristics of biological data types. The detected sites were validated using de novo motif discovery and ChIP-PCR. We demonstrate the specificity and accuracy of LONUT and show that our program not only improves the detection of binding sites for ChIP-seq, but also identifies additional binding sites.
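The assignment step described in the abstract can be illustrated with a minimal sketch. The fitted ES formula is not given on this page, so `placeholder_score` below is a hypothetical stand-in (higher enrichment and shorter distance give a higher score), and the names `Peak`, `nearest_peak`, and `assign_nut` are assumptions made for illustration, not part of LONUT itself.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Peak:
    chrom: str
    summit: int        # summit position of a peak called from UMTs only
    enrichment: float  # enrichment score reported for that peak


def placeholder_score(distance: int, enrichment: float) -> float:
    """Hypothetical stand-in for the regression-derived ES formula."""
    return enrichment / (1.0 + distance)


def nearest_peak(chrom: str, pos: int, peaks: List[Peak]) -> Optional[Peak]:
    """Return the closest peak on the same chromosome, or None if there is none."""
    same_chrom = [p for p in peaks if p.chrom == chrom]
    if not same_chrom:
        return None
    return min(same_chrom, key=lambda p: abs(p.summit - pos))


def assign_nut(
    candidates: List[Tuple[str, int]],
    peaks: List[Peak],
    score: Callable[[int, float], float] = placeholder_score,
) -> Optional[Tuple[str, int]]:
    """Pick the candidate location of one NUT with the highest score.

    Returns None when no candidate lies on a chromosome that has a peak.
    """
    best, best_score = None, float("-inf")
    for chrom, pos in candidates:
        peak = nearest_peak(chrom, pos, peaks)
        if peak is None:
            continue
        s = score(abs(peak.summit - pos), peak.enrichment)
        if s > best_score:
            best, best_score = (chrom, pos), s
    return best
```

Passing the score function as a parameter keeps the selection logic separate from the scoring formula, so a fitted ES formula could be swapped in for the placeholder without changing `assign_nut`.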

Highlights

Next-generation sequencing technologies have been widely used to address many biological and medical questions on a genome-wide scale
The first step of LOcating Non-Unique matched Tags (LONUT) is to divide the input dataset, the aligned tags file output by Bowtie, into two subsets: a set of uniquely matched tags (UMTs) and a set of non-unique tags (NUTs)
We combine the set of newly located tags from the NUTs with the set of original UMTs to produce a final set of combined matched tags (CMTs)
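The split and combine steps in the highlights above could look roughly like the following sketch. It assumes tab-delimited alignment lines whose first four columns are read name, strand, reference name, and offset (as in Bowtie's default output), and it assumes a tag counts as non-unique when more than one alignment is reported for the same read name. The page does not state the exact criterion LONUT uses, and the function names `split_tags` and `combine_tags` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Alignment = Tuple[str, int, str]  # (reference name, offset, strand)


def split_tags(
    lines: Iterable[str],
) -> Tuple[Dict[str, Alignment], Dict[str, List[Alignment]]]:
    """Group alignments by read name, then separate UMTs from NUTs."""
    by_read: Dict[str, List[Alignment]] = defaultdict(list)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        name, strand, chrom, offset = fields[:4]
        by_read[name].append((chrom, int(offset), strand))
    # Assumption: exactly one reported alignment means uniquely matched.
    umts = {name: alns[0] for name, alns in by_read.items() if len(alns) == 1}
    nuts = {name: alns for name, alns in by_read.items() if len(alns) > 1}
    return umts, nuts


def combine_tags(
    umts: Dict[str, Alignment], located_nuts: Dict[str, Alignment]
) -> Dict[str, Alignment]:
    """Merge the original UMTs with NUTs that were given a single location."""
    cmts = dict(umts)
    cmts.update(located_nuts)
    return cmts
```

Here `located_nuts` stands for whatever subset of the NUTs has been assigned a single location by the scoring step; NUTs left unassigned are simply absent from the combined set.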

Summary

Introduction

Many computational tools have been developed to analyze genomic datasets generated by sequencing-based technologies, including MACS [12], QuEST [13], SISSRs [14] and many other peak identification programs [15,16,17,18,19,20,21] for ChIP-seq data, and Cufflinks [22], Scripture [23] and SpliceTrap [24] for RNA-seq data. Despite this, limitations in data analysis still exist. NUTs comprise up to 60% of all raw tags [25]. Utilizing these NUTs would increase the sequencing depth and allow more accurate detection of enriched binding sites, which in turn may lead to more precise and significant biological insights.

Methods

Results

Conclusion

Journal: PLoS ONE	Publication Date: Jun 25, 2013
Citations: 38	License type: CC BY 4.0

Similar Papers

Endogenous Retrovirus Derepression Drives Ectopic UGT2B17 Overexpression in Multiple Myeloma Cells: Molecular Sequelae and Pathophysiological Implications
Spyros Ioannis Papamichos ... Ioannis Kotsianidis
Blood | VOL. 130
25 Jun 2021

Extended Sunflower Hidden Markov Models for the recognition of homotypic cis-regulatory modules
01 Jan 2013

FROM BINDING MOTIFS IN CHIP-SEQ DATA TO IMPROVED MODELS OF TRANSCRIPTION FACTOR BINDING SITES
Ivan Kulakovskiy ... Vsevolod Makeev
Journal of Bioinformatics and Computational Biology | VOL. 11
01 Feb 2013

Pinpointing transcription factor binding sites from ChIP-seq data with SeqSite
Xi Wang ... Xuegong Zhang
BMC Systems Biology | VOL. 5
01 Jan 2010
