Abstract

Background: The explosion of Next Generation Sequencing (NGS) data demands substantial effort in bioinformatics methods and analyses. Creating dedicated, robust and reliable pipelines that can take dozens of samples from raw FASTQ data to relevant biological data is a time-consuming task in every project relying on NGS. To address this, we created a generic and modular toolbox for developing such pipelines.

Results: TOGGLE (TOolbox for Generic nGs anaLysEs) is a suite of tools for designing pipelines that manage large sets of NGS software and utilities. Moreover, TOGGLE offers an easy way to manipulate the options of the different software packages throughout a pipeline by using a single basic configuration file, which can be changed for each assay without having to change the code itself. We also describe one implementation of TOGGLE in a complete analysis pipeline designed for SNP discovery in large sets of genomic data, ready to use in different environments (from a single machine to HPC clusters).

Conclusion: TOGGLE speeds up the creation of robust pipelines with reliable log tracking and data flow for a large range of analyses. Moreover, it enables biologists to concentrate on the biological relevance of results and to change the experimental conditions easily. The whole code and test data are available at https://github.com/SouthGreenPlatform/TOGGLE.

Electronic supplementary material: The online version of this article (doi:10.1186/s12859-015-0795-6) contains supplementary material, which is available to authorized users.

Highlights

  • The explosion of Next Generation Sequencing (NGS) data requires a huge effort in bioinformatics methods and analyses

  • We propose a set of packages designed for fast implementation of robust and reliable pipelines

  • We constructed log files at different levels: at the global level (GLOBAL_ANALYSIS_date in Fig. 3), at the sample/individual level, and at the package-per-individual level. This pipeline is a classical one for DNA-seq analyses, from FASTQ sequences to BAM files and then Variant Call Format (VCF) files, but with extensive control over file structure and format, and its specific and global options are easy to manage through the software.config.txt file
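
The idea of driving every tool's options from one configuration file can be illustrated with a minimal Python sketch. It is not TOGGLE's own code, and the file layout is an assumption made for illustration only: a line starting with `$` names a tool, and the lines that follow list its options, either as `flag=value` or as a bare flag. The function names `parse_software_config` and `build_command`, the example options, and the file names are all hypothetical.

```python
def parse_software_config(text):
    """Parse an ASSUMED per-tool options layout into {tool: [(flag, value)]}.

    Hypothetical layout: '$tool name' opens a section, following lines are
    'flag=value' or bare 'flag'; blank lines and '#' comments are skipped.
    """
    config = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("$"):
            # Start a new tool section.
            current = line[1:].strip()
            config[current] = []
        elif current is None:
            raise ValueError(f"option outside any tool section: {line!r}")
        else:
            flag, sep, value = line.partition("=")
            # A bare flag has no '=' and therefore no value.
            config[current].append((flag.strip(), value.strip() if sep else None))
    return config


def build_command(tool, config, inputs):
    """Build an argument list for `tool` from its configured options."""
    args = tool.split()
    for flag, value in config.get(tool, []):
        args.append(flag)
        if value is not None:
            args.append(value)
    return args + list(inputs)


# Hypothetical file contents, for illustration only.
EXAMPLE = """
# options for the mapping step
$bwa aln
-n=5

$samtools view
-h
-b
"""

config = parse_software_config(EXAMPLE)
command = build_command("bwa aln", config, ["reference.fasta", "sample_1.fastq"])
print(command)
```

With this kind of separation, changing an option for a new assay means editing the configuration text rather than the pipeline code, which is the property the highlights attribute to the software.config.txt file.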


