Abstract

Background

Recent advances in next-generation sequencing have revolutionized genomic research. 16S rRNA amplicon sequencing using paired-end sequencing on the MiSeq platform from Illumina, Inc., is being used to characterize the composition and dynamics of extremely complex and diverse microbial communities. For such analyses, merging and quality filtering of paired-end reads are essential first steps that ensure the accuracy and reliability of downstream results.

Results

We have developed the Merging and Filtering Tool (MeFiT) to combine these pre-processing steps into one simple, intuitive pipeline. MeFiT invokes CASPER (context-aware scheme for paired-end reads) to merge paired-end reads and gives users the option to quality filter the reads using either the traditional average Q-score metric or a maximum expected error cut-off threshold.

Conclusions

MeFiT provides an open-source solution that permits users to merge and filter paired-end Illumina reads. The tool is implemented in Python and the source code is freely available at https://github.com/nisheth/MeFiT.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-016-1358-1) contains supplementary material, which is available to authorized users.
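The two filtering options named in the Results, average Q-score and maximum expected error, can be illustrated with a minimal Python sketch. This is not MeFiT's own code: the function names, the Phred+33 encoding, and the thresholds (mean Q of 30, expected error of 1.0) are assumptions chosen for illustration. It uses the standard Phred relation, in which a base with quality Q has error probability 10^(-Q/10), and the expected number of errors in a read is the sum of those probabilities.

```python
def phred_scores(quality, offset=33):
    """Decode a FASTQ quality string into Phred scores (Phred+33 assumed)."""
    if not quality:
        raise ValueError("quality string must not be empty")
    return [ord(ch) - offset for ch in quality]


def mean_q(quality, offset=33):
    """Arithmetic mean of the per-base Phred scores."""
    scores = phred_scores(quality, offset)
    return sum(scores) / len(scores)


def expected_errors(quality, offset=33):
    """Sum of per-base error probabilities, p = 10 ** (-Q / 10)."""
    return sum(10 ** (-q / 10) for q in phred_scores(quality, offset))


def passes_mean_q(quality, min_mean_q=30.0):
    """Hypothetical filter: keep reads whose mean Q meets the threshold."""
    return mean_q(quality) >= min_mean_q


def passes_max_ee(quality, max_ee=1.0):
    """Hypothetical filter: keep reads whose expected errors stay within the cut-off."""
    return expected_errors(quality) <= max_ee


# Eight Q40 bases ('I') followed by two Q2 bases ('#').
example = "IIIIIIII##"
print(mean_q(example))           # mean Q is 32.4
print(expected_errors(example))  # roughly 1.26 expected errors
print(passes_mean_q(example), passes_max_ee(example))
```

In this example the read passes the average Q-score filter but fails the expected error filter, because the mean hides the two very low-quality bases while the expected error sum is dominated by them.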

Highlights

  • Recent advances in next-generation sequencing have revolutionized genomic research. 16S ribosomal RNA (rRNA) amplicon sequencing using paired-end sequencing on the MiSeq platform from Illumina, Inc., is being used to characterize the composition and dynamics of extremely complex and diverse microbial communities

  • Read length is selectable and currently reaches up to 300 bases. A target deoxyribonucleic acid (DNA) segment that is longer than the sum of the forward and reverse read lengths leaves a gap of missing sequence between the reads, whereas a shorter target segment produces an overlap between them

  • Since quality tends to degrade towards the ends of the reads, reliable merging of overlapping paired-end reads can result in a combined DNA sequence that might permit bioinformatic correction of these 3′-end sequencing errors and yield higher-quality sequence output
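The gap-versus-overlap relationship described above comes down to simple arithmetic, sketched here in Python. The function name and the example target lengths are hypothetical, and the sketch assumes that each read covers at most the full target segment.

```python
def overlap_or_gap(target_len, fwd_len, rev_len):
    """Signed overlap between a read pair on a target segment.

    Positive: number of bases covered by both reads (overlap).
    Negative: number of bases covered by neither read (gap).
    Zero: the reads abut exactly.
    """
    if min(target_len, fwd_len, rev_len) <= 0:
        raise ValueError("all lengths must be positive")
    # A read cannot cover more of the target than the target's own length.
    covered_fwd = min(fwd_len, target_len)
    covered_rev = min(rev_len, target_len)
    return covered_fwd + covered_rev - target_len


# Hypothetical 460-base target sequenced with 2 x 300 reads: 140-base overlap.
print(overlap_or_gap(460, 300, 300))
# Hypothetical 700-base target with the same reads: 100-base gap.
print(overlap_or_gap(700, 300, 300))
```

Only pairs with a positive result can be merged by overlap, which is why the read length chosen relative to the amplicon length matters for this pre-processing step.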



