Abstract

The SAM/BAM alignment file formats are used in virtually every laboratory that performs high-throughput sequencing. However, little effort has so far been dedicated to the development of SAM/BAM quality reporting tools. To overcome this problem, we developed SAM-Profiler, a multiplatform tool dedicated to the advanced quality reporting of SAM/BAM files. SAM-Profiler performs qualitative analysis of SAM/BAM alignment data in the context of next-generation sequencing. It is implemented in C# and runs under the Windows, Linux and macOS operating systems. Two versions are available: a fully graphical, event-driven application and a command-line tool. SAM-Profiler generates an extensive set of qualitative reports on SAM/BAM alignment data, including: overall, per-base and per-chromosome read quality; mapping quality; duplicate and coverage analyses; base distribution; perfect, proper and improper mapping; exonic, intronic, intergenic, 5′ and 3′ UTR coverage; mismatch distribution profile; and CG distribution. For paired-end sequencing experiments, the tool automatically reports the insert size distribution and analyzes the relative pair mapping, reporting the absolute and relative distribution of properly mapped, improperly mapped, mapped/unmapped and unmapped pairs. Its modular architecture allows additional analytical monitoring and reporting tools to be added to the existing set, so that SAM-Profiler can grow according to the specific requests of end users.
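The paired-end report described above can be illustrated with a minimal Python sketch. It is not SAM-Profiler's code (the tool is written in C#, and its internals are not described here). It only assumes the standard SAM text layout: the FLAG in column 2, TLEN in column 9, and the FLAG bits defined by the SAM specification. The function names `classify_pair` and `pair_report`, and the choice to count each pair once through its first read, are assumptions made for illustration.

```python
from collections import Counter

# FLAG bits as defined in the SAM specification.
PAIRED, PROPER_PAIR, UNMAPPED, MATE_UNMAPPED = 0x1, 0x2, 0x4, 0x8
READ1, SECONDARY, SUPPLEMENTARY = 0x40, 0x100, 0x800


def classify_pair(flag):
    """Assign a paired read to one of four pair-mapping categories (hypothetical helper)."""
    if flag & UNMAPPED and flag & MATE_UNMAPPED:
        return "unmapped"
    if flag & UNMAPPED or flag & MATE_UNMAPPED:
        return "mapped/unmapped"
    if flag & PROPER_PAIR:
        return "proper"
    return "improper"


def pair_report(sam_lines):
    """Return absolute counts, relative fractions and insert sizes for read pairs."""
    counts = Counter()
    insert_sizes = []
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        # Assumption: count each pair once, via its primary first-in-pair record.
        if not flag & PAIRED or not flag & READ1:
            continue
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue
        category = classify_pair(flag)
        counts[category] += 1
        if category == "proper":
            # TLEN is signed; its absolute value is used as the insert size here.
            insert_sizes.append(abs(int(fields[8])))
    total = sum(counts.values())
    fractions = {k: v / total for k, v in counts.items()} if total else {}
    return counts, fractions, insert_sizes
```

`pair_report` accepts any iterable of SAM text lines, for example an open file handle on an uncompressed SAM file. BAM input would first need decoding with a BAM-capable reader, which this sketch does not cover.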

Highlights

  • The development of high-throughput sequencing instruments generated a tremendous amount of data from different sources: from viral and bacterial de novo genomes to human resequencing projects, such as the ambitious 1000 genomes project [1]

  • To overcome the limited availability of SAM/BAM reporting tools, we developed SAM-Profiler, a bioinformatics tool dedicated to the advanced quality reporting of SAM and BAM files available as fully graphical, event-driven software and as a command-line tool

  • Bone Marrow (BM) or Peripheral Blood (PB) samples from atypical chronic myeloid leukemia patients were collected at diagnosis, after obtaining written informed consent approved by the local ethics committee [5]


Summary

Introduction

The development of high-throughput sequencing instruments has generated a tremendous amount of data from different sources, ranging from viral and bacterial de novo genomes to human resequencing projects such as the ambitious 1000 Genomes Project [1]. To demonstrate how SAM-Profiler can be used to generate quality reports (Figure 1) from next-generation sequencing datasets, we analyzed a set of 13 paired-end Whole-Exome Sequencing (WES) BAM files from our recently published high-throughput sequencing study [5].

