F-Seq2: improving the feature density based peak caller with dynamic statistics.

Nanxiang Zhao, Alan P Boyle

doi:10.1093/nargab/lqab012

Abstract

Genomic and epigenomic features are captured at a genome-wide level by using high-throughput sequencing (HTS) technologies. Peak calling delineates features identified in HTS experiments, such as open chromatin regions and transcription factor binding sites, by comparing the observed read distributions to a random expectation. Since its introduction, F-Seq has been widely used and shown to be the most sensitive and accurate peak caller for DNase I hypersensitive site (DNase-seq) data. However, the first release (F-Seq1) has two key limitations: lack of support for user-input control datasets, and poor test statistic reporting. These constrain its ability to capture systematic and experimental biases inherent to the background distributions in peak prediction, and to subsequently rank predicted peaks by confidence. To address these limitations, we present F-Seq2, which combines kernel density estimation and a dynamic ‘continuous’ Poisson test to account for local biases and accurately rank candidate peaks. The output of F-Seq2 is suitable for irreproducible discovery rate analysis as test statistics are calculated for individual candidate summits, allowing direct comparison of predictions across replicates. These improvements significantly boost the performance of F-Seq2 for ATAC-seq and ChIP-seq datasets, outperforming competing peak callers used by the ENCODE Consortium in terms of precision and recall.
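The combination the abstract names (a kernel density estimate of read signal, scored by a 'continuous' Poisson test against a locally chosen background rate) can be illustrated with a minimal sketch. This is not the F-Seq2 implementation. The function names, the Gaussian kernel, the window-based choice of the background rate, and the use of the regularized lower incomplete gamma function to extend the Poisson upper tail to non-integer signal values are all assumptions made for illustration.

```python
import math


def gaussian_kde_signal(read_positions, grid, bandwidth):
    """Sum Gaussian kernels centered on each read, evaluated at each grid point.

    Hypothetical helper: kernel choice and normalization are assumptions.
    """
    norm = 1.0 / (bandwidth * math.sqrt(2.0 * math.pi))
    signal = []
    for x in grid:
        total = 0.0
        for r in read_positions:
            z = (x - r) / bandwidth
            total += math.exp(-0.5 * z * z)
        signal.append(norm * total)
    return signal


def dynamic_lambda(background_signal, index, window_sizes):
    """Pick the largest of the genome-wide and local-window background means.

    Assumption: a conservative 'max over windows' rule, used here only to
    show how a background rate can vary with position.
    """
    candidates = [sum(background_signal) / len(background_signal)]
    for w in window_sizes:
        lo = max(0, index - w // 2)
        hi = min(len(background_signal), index + w // 2 + 1)
        window = background_signal[lo:hi]
        candidates.append(sum(window) / len(window))
    return max(candidates)


def regularized_lower_gamma(a, x, max_terms=10000, tol=1e-12):
    """Regularized lower incomplete gamma P(a, x) by series expansion (a > 0)."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    denom = a
    for _ in range(max_terms):
        denom += 1.0
        term *= x / denom
        total += term
        if abs(term) < abs(total) * tol:
            break
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))


def continuous_poisson_pvalue(signal_value, lam):
    """Upper-tail p-value that also accepts non-integer signal values.

    For an integer k this equals P(X >= k) with X ~ Poisson(lam), because
    P(X >= k) = P(k, lam), the regularized lower incomplete gamma function.
    """
    if signal_value <= 0.0:
        return 1.0
    return min(1.0, regularized_lower_gamma(signal_value, lam))
```

In this sketch a candidate summit would be scored by evaluating `continuous_poisson_pvalue` on the treatment signal at the summit, with `lam` taken from `dynamic_lambda` on the control (or background) signal at the same position. Scoring each summit individually is what gives every candidate its own test statistic for ranking.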

Journal: NAR Genomics and Bioinformatics
Publication Date: Jan 6, 2021
Citations: 9
License type: CC BY-NC 4.0
