Abstract

DNA metabarcoding is broadly used in biodiversity studies encompassing a wide range of organisms. Erroneous amplicons, generated during amplification and sequencing procedures, are a major source of concern for the interpretation of metabarcoding results. Several denoising programs have been implemented to detect and eliminate these errors. However, almost all denoising software currently available has been designed to process non-coding ribosomal sequences, most notably prokaryotic 16S rDNA. The growing number of metabarcoding studies using coding markers such as COI or RuBisCO demands a re-assessment and calibration of denoising algorithms. Here we present DnoisE, the first denoising program designed to detect erroneous reads and merge them with the correct ones using information from the natural variability (entropy) associated with each codon position in coding barcodes. DnoisE is open-source software based on a modified version of the UNOISE algorithm. It implements different merging procedures as options, and can incorporate codon entropy information either retrieved from the data or supplied by the user. In addition, the DnoisE algorithm is parallelizable, greatly reducing runtimes on computer clusters. The program also accepts different input file formats, so it can be readily incorporated into existing metabarcoding pipelines.
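To make the notion of per-codon-position entropy concrete, the following is a minimal sketch of how such values could be estimated from a set of aligned coding reads. It is an illustration only and is not taken from the DnoisE source code: the function names `shannon_entropy` and `codon_position_entropy`, the `frame_start` parameter, and the choice to average unweighted per-site Shannon entropy (ignoring read abundances) are all assumptions made for this example.

```python
from collections import Counter
from math import log2


def shannon_entropy(symbols):
    """Shannon entropy, in bits, of the symbols observed at one alignment site."""
    counts = Counter(symbols)
    total = sum(counts.values())
    # p * log2(1/p) summed over observed symbols; 0.0 for an invariant site.
    return sum((n / total) * log2(total / n) for n in counts.values())


def codon_position_entropy(seqs, frame_start=0):
    """Mean per-site entropy for codon positions 1, 2 and 3.

    Hypothetical helper: assumes the sequences are aligned, of equal length,
    and that the first complete codon starts at index `frame_start`.
    Read abundances are ignored (every sequence counts once).
    """
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")

    per_position = {1: [], 2: [], 3: []}
    for site in range(frame_start, length):
        codon_pos = (site - frame_start) % 3 + 1  # 1, 2 or 3
        column = [s[site] for s in seqs]
        per_position[codon_pos].append(shannon_entropy(column))

    # Average the site entropies belonging to each codon position.
    return {p: sum(v) / len(v) for p, v in per_position.items() if v}


if __name__ == "__main__":
    # Toy alignment: two codons, variation only at the last (third) position.
    toy = ["ATGGCA", "ATGGCC", "ATGGCG", "ATGGCT"]
    print(codon_position_entropy(toy))
```

In this toy alignment all variation falls on the third codon position, which mirrors the pattern that motivates weighting differences by codon position in coding markers. How such values enter the merging decision is specific to DnoisE and is not reproduced here.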

Highlights

  • Biodiversity studies have experienced a revolution in the last decade with the application of high-throughput sequencing (HTS) techniques

  • Many of these studies have direct implications for the management and conservation of ecosystems and are providing direct benefits to society. They have brought to light a bewildering diversity of organisms in habitats that are difficult to study with traditional techniques

  • DnoisE is a novel denoising program that can be incorporated into any metabarcoding pipeline


Summary

Introduction

Biodiversity studies have experienced a revolution in the last decade with the application of high-throughput sequencing (HTS) techniques. We compared the run speed of DnoisE with and without entropy correction on the same sequence dataset.
