Abstract

Background: The Illumina sequencing platform is widely used in genome research. Quality assessment and control of sequence reads are needed before downstream analysis. However, software that provides efficient quality assessment and versatile filtration methods is still lacking.

Results: We have developed a toolkit named HTQC (an abbreviation of High-Throughput Quality Control) for quality control of sequence reads. It consists of six programs for read quality assessment, read filtration and generation of graphic reports.

Conclusions: The HTQC toolkit generates read quality assessments faster than existing tools, and these assessments provide guidance for the read filtration utilities, which allow users to choose different strategies to remove low-quality reads.

Highlights

  • The Illumina sequencing platform is widely used in genome research

  • The device performs sequencing by DNA synthesis on clusters of identical DNA molecules simultaneously

  • Workflow demonstration: to demonstrate the function of HTQC, a paired-end sequence dataset from a human gut metagenome was used as an example

Summary

Background

Next-generation sequencing technologies are generating massive amounts of sequence data [1], and different platforms can introduce varying levels of sequence read errors. To find tile-specific problems such as a high error rate or low data production, a stacked bar chart shows the number of reads in different quality ranges using different colors, with one bar per tile (Figure 1F). For paired-end read quality assessment, the ht_stat program creates separate charts for each end and calculates the correlation between the read quality of the two ends (Figure 1G). All results generated by ht_stat are written to a series of tab-delimited plain-text files, which can be visualized using ht_stat_draw.pl or any spreadsheet software such as Microsoft Excel or LibreOffice Calc.

The HTQC toolkit provides four programs for read filtration: ht_tile_filter, ht_trim, ht_qual_filter and ht_length_filter. The cutoff values of these programs, such as the thresholds on minimum read quality or minimum read length, are user defined.
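To make the idea of user-defined cutoffs concrete, the following Python sketch shows how tail trimming, a quality filter and a length filter could act on a single read. It is an illustration only and is not the HTQC implementation: the function names (phred_scores, trim_tail, passes_filters), the use of mean quality as the filtering criterion, and the Phred+33 quality encoding are all assumptions made for this example.

```python
from statistics import mean

# Assumption: qualities are encoded as Phred+33 (Sanger / Illumina 1.8+).
# Older Illumina pipelines used an offset of 64 instead.
PHRED_OFFSET = 33


def phred_scores(quality_string, offset=PHRED_OFFSET):
    """Decode a FASTQ quality string into a list of integer Phred scores."""
    return [ord(ch) - offset for ch in quality_string]


def trim_tail(seq, qual, min_qual, offset=PHRED_OFFSET):
    """Drop bases from the 3' end while their quality is below min_qual.

    Hypothetical helper illustrating tail trimming; returns the trimmed
    (sequence, quality) pair.
    """
    end = len(qual)
    while end > 0 and ord(qual[end - 1]) - offset < min_qual:
        end -= 1
    return seq[:end], qual[:end]


def passes_filters(seq, qual, min_mean_qual, min_length, offset=PHRED_OFFSET):
    """Return True if the read meets both user-defined cutoffs.

    Hypothetical helper: a read is rejected if it is empty, shorter than
    min_length, or if its mean Phred score is below min_mean_qual.
    """
    if not seq or len(seq) < min_length:
        return False
    return mean(phred_scores(qual, offset)) >= min_mean_qual


# Example read: six high-quality bases ('I' = Q40) followed by two
# low-quality bases ('#' = Q2).
seq, qual = "ACGTACGT", "IIIIII##"
trimmed_seq, trimmed_qual = trim_tail(seq, qual, min_qual=20)
keep = passes_filters(trimmed_seq, trimmed_qual, min_mean_qual=30, min_length=5)
```

In this sketch, trimming is applied before filtering, so a read with a poor 3' tail can still be kept if the remaining bases are long enough and of sufficient quality. Whether to trim first, filter first, or use only one of the two is exactly the kind of strategy choice that the user-defined thresholds are meant to support.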

Results and discussion
Conclusions
Metzker ML