Productive visualization of high-throughput sequencing data using the SeqCode open portable platform

Enrique Blanco, Mar González-Ramírez, Luciano Di Croce

doi:10.1038/s41598-021-98889-7

Abstract

Large-scale sequencing techniques to chart genomes are entirely consolidated. Stable computational methods to perform primary tasks such as quality control, read mapping, peak calling, and counting are likewise available. However, there is a lack of uniform standards for graphical data mining, which is also of central importance. To fill this gap, we developed SeqCode, an open suite of applications that analyzes sequencing data in an elegant but efficient manner. Our software is a portable resource written in ANSI C that can be expected to work for almost all genomes in any computational configuration. Furthermore, we offer a user-friendly front-end web server that integrates SeqCode functions with other graphical analysis tools. Our analysis and visualization toolkit represents a significant improvement in terms of performance and usability compared to other existing programs. Thus, SeqCode has the potential to become a key multipurpose instrument for high-throughput professional analysis; further, it provides an extremely useful open educational platform for the worldwide scientific community. The SeqCode website is hosted at http://ldicrocelab.crg.eu, and the source code is freely distributed at https://github.com/eblancoga/seqcode.

Highlights

Large-scale sequencing techniques to chart genomes are entirely consolidated.
Powerful bioinformatic tools are available to manage this volume of data at a primary stage: (i) quality control profilers evaluate distinct scoring metrics on raw information[4,5,6]; (ii) mapping algorithms identify the location of each read on the genome[7,8,9]; (iii) peak callers find clusters of reads significantly enriched in certain genomic regions in the sample map file[10,11,12]; (iv) genome browsers are useful to visualize genome-wide binding profiles and peaks[13,14,15,16]; and (v) other auxiliary applications convert intermediate files into the appropriate data formats[17,18,19,20].
Information on a particular genome assembly is loaded from two external files that must be supplied by the user: (i) the chromosome size file (ChromInfo.txt), and (ii) the gene transcript annotations, as provided by the RefSeq consortium[26].
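
To make the first of these two inputs concrete, the following is a minimal sketch of how a chromosome size file could be read. It is not SeqCode's own parser (SeqCode is written in ANSI C). It assumes a tab-separated layout with the chromosome name in the first column and its length in the second, as in the chromInfo.txt tables distributed by UCSC, and it ignores any further columns. The function name read_chrom_sizes and the sample values are hypothetical and chosen only for illustration.

```python
import io


def read_chrom_sizes(handle):
    """Return {chromosome: length} from a tab-separated size table.

    Assumption: column 1 is the chromosome name and column 2 its length in bp;
    any additional columns are ignored. Blank lines and '#' comments are skipped.
    """
    sizes = {}
    for line_number, line in enumerate(handle, start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            raise ValueError(
                f"line {line_number}: expected at least 2 tab-separated fields"
            )
        sizes[fields[0]] = int(fields[1])
    return sizes


# Illustrative values only; these are not taken from a real assembly.
example = io.StringIO("chrA\t1000\t/path/a.2bit\nchrB\t250\n")
sizes = read_chrom_sizes(example)
print(sizes)
```

In practice the same function could be given an open file handle, for example open("ChromInfo.txt"), instead of the in-memory sample used here.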

Summary

Introduction

Large-scale sequencing techniques to chart genomes are entirely consolidated. Stable computational methods to perform primary tasks such as quality control, read mapping, peak calling, and counting are likewise available. However, there is a lack of uniform standards for graphical data mining, which is also of central importance. To fill this gap, we developed SeqCode, an open suite of applications that analyzes sequencing data in an elegant but efficient manner. We offer a user-friendly front-end web server that integrates SeqCode functions with other graphical analysis tools. Current high-throughput sequencing techniques (e.g. ChIP-seq, ATAC-seq, and RNA-seq) can use a single run to identify the repertoire of functional characteristics of the genome. We first illustrate the main characteristics of SeqCode, introduce the collection of principal SeqCode features to perform high-quality graphical analysis of sequencing data, and propose a standardized nomenclature of representations. SeqCode is entirely focused on the graphical analysis of 1D genomic data (e.g. ChIP-seq, RNA-seq). We comprehensively review the existing literature on similar tools to evaluate our software in comparison to current approaches.

Journal: Scientific Reports
Publication Date: Oct 1, 2021
Citations: 10
License type: open-access
