Abstract

Motivation: CRAM has established itself as a high-compression alternative to the BAM file format for DNA sequencing data. We describe updates that further improve compression on data from modern sequencing instruments.

Results: With Illumina data, CRAM 3.1 is 7–15% smaller than the equivalent CRAM 3.0 file, and 50–70% smaller than the corresponding BAM file. Long-read technology shows more modest compression due to the presence of high-entropy signals.

Availability and implementation: The CRAM 3.0 specification is freely available from https://samtools.github.io/hts-specs/CRAMv3.pdf. The CRAM 3.1 improvements are available in a separate open-source HTScodecs library from https://github.com/samtools/htscodecs, and have been incorporated into HTSlib.

Supplementary information: Supplementary data are available at Bioinformatics online.

