COPAR: A ChIP-Seq Optimal Peak Analyzer.

Binhua Tang, Victor X Jin, Xihan Wang

doi:10.1155/2017/5346793

Abstract

Sequencing data quality and peak alignment efficiency of ChIP-sequencing profiles directly affect the reliability and reproducibility of NGS experiments. To date, no tool has been specifically designed for optimal peak alignment estimation and quality-related genomic feature extraction for ChIP-sequencing profiles. We developed COPAR, an open-source, user-friendly package, to statistically investigate, quantify, and visualize the optimal peak alignment and inherent genomic features using ChIP-seq data from NGS experiments. It gives biologists a versatile means to perform quality checks on high-throughput experiments and to optimize their experimental design. COPAR processes mapped ChIP-seq read files in BED format and outputs statistically sound results for multiple high-throughput experiments. We verified the package on three public ChIP-seq data sets and have deposited COPAR, together with these data sets, on GitHub under a GNU GPL license.

Highlights

Next-generation sequencing (NGS) integrated with ChIP technology provides a genome-wide perspective for biomedical research and clinical diagnosis applications [1,2,3]. Data quality and peak alignment of ChIP-sequencing profiles directly affect the reliability and reproducibility of analysis results.
The parameters most often examined in ChIP-seq peak calling are peak number, false discovery rate (FDR), the corresponding bin size, and other statistical thresholds selected in each analysis.
Few publications or application notes address these topics; we propose a flexible package based on feature extraction and signal processing algorithms to solve this parameter-selection optimization problem in optimal peak alignment.

Summary

Introduction

Data quality and peak alignment of ChIP-sequencing profiles directly affect the reliability and reproducibility of analysis results. The parameters most often examined in ChIP-seq peak calling are peak number, false discovery rate (FDR), the corresponding bin size, and other statistical thresholds selected in each analysis. In practice, these parameters make it difficult for biologists and bioinformaticians to choose a suitable combination of conditions for analyzing experimental results. Few publications or application notes address these topics; we propose a flexible package based on feature extraction and signal processing algorithms to solve this parameter-selection optimization problem in optimal peak alignment.
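To illustrate why the choice of bin size and threshold matters, the following minimal Python sketch counts mapped reads per fixed-size bin from BED-formatted lines and reports the bins that reach a read-count cutoff. It is not COPAR's algorithm: the function names `bin_counts` and `candidate_bins`, the `min_reads` cutoff, and the toy reads are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def bin_counts(bed_lines, bin_size):
    """Count reads per (chrom, bin index), binning each read by its start coordinate."""
    counts = Counter()
    for line in bed_lines:
        # Skip blank lines and the comment/track/browser lines a BED file may contain.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, _end = line.split()[:3]
        counts[(chrom, int(start) // bin_size)] += 1
    return counts


def candidate_bins(bed_lines, bin_size, min_reads):
    """Return the bins whose read count reaches min_reads (a naive, assumed cutoff)."""
    counts = bin_counts(bed_lines, bin_size)
    return sorted(key for key, n in counts.items() if n >= min_reads)


# Toy mapped reads in BED format (chrom, start, end); purely illustrative data.
reads = [
    "chr1\t100\t136",
    "chr1\t120\t156",
    "chr1\t180\t216",
    "chr1\t420\t456",
    "chr1\t460\t496",
    "chr2\t50\t86",
]

# The same reads yield different candidate regions as the bin size changes.
for size in (100, 200, 500):
    print(size, candidate_bins(reads, size, min_reads=2))
```

With these toy reads, a bin size of 100 or 200 yields two candidate bins on chr1, whereas a bin size of 500 merges them into one. Even this simple cutoff shows how the reported peak number depends on the bin size and threshold chosen together.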

Methods

Results

Conclusion