Abstract

Many parallel applications of Next-Generation Sequencing (NGS) technologies demand short barcodes that can accurately multiplex a large number of samples. To reconcile these competing requirements, the use of error-correcting codes is advised. Current barcoding systems are mostly built from short random error-correcting codes, a feature that strongly limits their multiplexing accuracy and experimental scalability. To overcome these problems on sequencing systems impaired by mismatch errors, binary BCH and pseudo-quaternary Hamming codes have been proposed as alternatives. However, these codes either fail to offer a fine-grained choice of barcode sizes (BCH) or have intrinsically poor error-correcting abilities (Hamming). Here, the design of barcodes from shortened binary BCH codes and quaternary Low Density Parity Check (LDPC) codes is introduced. Simulation results show that, although accurate barcoding systems of high multiplexing capacity can be obtained with any of these codes, quaternary LDPC codes may be particularly advantageous because of their lower rates of read losses and undetected sample misidentification errors. Even at mismatch error rates of 10⁻² per base, 24-nt LDPC barcodes can be used to multiplex roughly 2000 samples with a sample misidentification error rate on the order of 10⁻⁹, at the expense of a read-loss rate on the order of just 10⁻⁶.

Highlights

  • Molecular barcoding provides the opportunity to multiplex next-generation sequencing [1] capacity across multiple individuals at specific portions of the genomes [2, 3]

  • To improve the design flexibility accomplished with shortened BCH barcodes, we further introduce Low Density Parity Check (LDPC) barcodes, a class of barcodes built from quaternary LDPC codes [29]

  • BCH barcodes of size N ≤ 25 constrained to accomplish M ≥ 24 and p_u ≤ 10⁻⁸ over a Quaternary Symmetric Channel (QSC) model where mismatch errors occur with probability p_s = 10⁻²
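
To make the Quaternary Symmetric Channel concrete, the following minimal Python sketch passes a barcode through such a channel and computes the chance that a barcode of a given length picks up at least one mismatch. It assumes that an erroneous base is replaced by one of the three other bases with equal probability, which is the usual reading of "symmetric" but is not spelled out in the text above. The function names `qsc_transmit` and `prob_at_least_one_mismatch` are illustrative and are not taken from the paper.

```python
import random

BASES = "ACGT"


def qsc_transmit(barcode, ps, rng):
    """Send a barcode through a quaternary symmetric channel.

    Each base is independently replaced, with probability ps, by one of
    the three other bases chosen uniformly (an assumption of this sketch).
    """
    received = []
    for base in barcode:
        if rng.random() < ps:
            # Mismatch error: substitute a different base.
            received.append(rng.choice([b for b in BASES if b != base]))
        else:
            received.append(base)
    return "".join(received)


def prob_at_least_one_mismatch(n, ps):
    """Probability that a barcode of n bases suffers one or more mismatches."""
    return 1.0 - (1.0 - ps) ** n


if __name__ == "__main__":
    rng = random.Random(0)          # fixed seed for repeatability
    barcode = "ACGT" * 6            # an arbitrary 24-nt example sequence
    print(qsc_transmit(barcode, 1e-2, rng))
    print(prob_at_least_one_mismatch(24, 1e-2))
```

For a 24-nt barcode at a mismatch probability of 10⁻² per base, the second function gives roughly 0.21, so about one read in five carries at least one barcode error. This is why error correction, rather than exact matching, matters at this error rate.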


Introduction

Molecular barcoding provides the opportunity to multiplex next-generation sequencing [1] capacity across multiple individuals at specific portions of the genomes [2, 3]. It relies on the ability of rather short oligos, known as barcodes, to tag DNA fragments belonging to different samples. Barcodes, which can be deployed either as part of adapters [5, 6, 7] or amplification primers [2, 4, 8], are expected to simultaneously offer negligible interference with DNA sequencing reactions, high resilience against sequencing errors and high multiplexing capacity. Large sets of random DNA sequences of size N are first screened to ensure the satisfiability of chemistry
