A benchmark study on error-correction by read-pairing and tag-clustering in amplicon-based deep sequencing.

Tian-Hao Zhang, Ren Sun, Nicholas C Wu

doi:10.1186/s12864-016-2388-9

Abstract

Background: The high error rate of next generation sequencing (NGS) restricts some of its applications, such as monitoring virus mutations and detecting rare mutations in tumors. There are two commonly employed sequencing library preparation strategies to improve sequencing accuracy by correcting sequencing errors: read-pairing method and tag-clustering method (i.e. primer ID or UID). Here, we constructed a homogeneous library from a single clone, and compared the variant calling accuracy of these error-correction methods.

Result: We comprehensively described the strengths and pitfalls of these methods. We found that both read-pairing and tag-clustering methods significantly decreased sequencing error rate. While the read-pairing method was more effective than the tag-clustering method at correcting insertion and deletion errors, it was not as effective as the tag-clustering method at correcting substitution errors. In addition, we observed that when the read quality was poor, the tag-clustering method led to huge coverage loss. We also tested the effect of applying quality score filtering to the error-correction methods and demonstrated that quality score filtering was able to impose a minor, yet statistically significant improvement to the error-correction methods tested in this study.

Conclusion: Our study provides a benchmark for researchers to select suitable error-correction methods based on the goal of the experiment by balancing the trade-off between sequencing cost (i.e. sequencing coverage requirement) and detection sensitivity.

Electronic supplementary material: The online version of this article (doi:10.1186/s12864-016-2388-9) contains supplementary material, which is available to authorized users.

Highlights

The high error rate of next generation sequencing (NGS) restricts some of its applications, such as monitoring virus mutations and detecting rare mutations in tumors
Our study provides a benchmark for researchers to select suitable error-correction methods based on the goal of the experiment by balancing the trade-off between sequencing cost and detection sensitivity
To resolve the problems associated with the high error rate, experimental methods have been developed for distinguishing real mutations from sequencing errors

Summary

Introduction

The high error rate of next generation sequencing (NGS) restricts some of its applications, such as monitoring virus mutations and detecting rare mutations in tumors. To resolve the problems associated with the high error rate, experimental methods have been developed for distinguishing real mutations from sequencing errors. One such method is to take advantage of the paired-end feature of Illumina sequencing by removing the inconsistent forward and reverse read pairs [1,2,3,4,5]. Another common approach is to use nucleotide tags [6,7,8,9,10,11,12]. The same tag would be observed in different reads
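As a rough illustration of the two strategies named above, the following Python sketch filters read pairs whose forward and reverse reads disagree, and builds a majority-vote consensus for reads that share a tag. It is not the authors' pipeline. The function names, the assumption that forward and reverse reads overlap completely, the equal-length requirement within a tag cluster, and the `min_cluster_size` threshold of 3 are all hypothetical choices made for this example.

```python
from collections import Counter, defaultdict

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def read_pairing_filter(pairs):
    """Keep forward reads whose reverse mate agrees at every position.

    Assumption: each pair fully overlaps, so the forward read should equal
    the reverse complement of the reverse read. Inconsistent pairs are dropped.
    """
    return [fwd for fwd, rev in pairs if fwd == reverse_complement(rev)]


def tag_cluster_consensus(tagged_reads, min_cluster_size=3):
    """Group reads by tag and call a per-position majority consensus.

    Assumptions: min_cluster_size is a hypothetical threshold, and clusters
    containing reads of unequal length are skipped rather than aligned.
    """
    clusters = defaultdict(list)
    for tag, seq in tagged_reads:
        clusters[tag].append(seq)

    consensus = {}
    for tag, seqs in clusters.items():
        if len(seqs) < min_cluster_size:
            continue  # too few reads to outvote an error
        if len({len(s) for s in seqs}) != 1:
            continue  # length mismatch; this sketch does no alignment
        consensus[tag] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


pairs = [("AAGT", "ACTT"), ("AAGT", "ACTA")]
kept = read_pairing_filter(pairs)

tagged = [("TAG1", "ACGT"), ("TAG1", "ACGT"), ("TAG1", "ACTT"), ("TAG2", "GGGG")]
calls = tag_cluster_consensus(tagged)
```

In this toy input the second read pair is discarded because its mates disagree, and the tag with a single read is discarded because it falls below the cluster-size threshold. Both outcomes show how error correction trades coverage for accuracy.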



Journal: BMC Genomics	Publication Date: Feb 12, 2016
Citations: 34	License type: CC BY 4.0


Similar Papers

Primer ID Validates Template Sampling Depth and Greatly Reduces the Error Rate of Next-Generation Sequencing of HIV-1 Genomic RNA Populations
Shuntai Zhou ... W I Sundquist
Journal of Virology | VOL. 89
03 Jun 2015

A new method for DNA sequencing error verification and correction via an on-disk index tree
Yarong Gu ... Qiang Zhu
09 Sep 2015

A hybrid correcting method considering heterozygous variations by a comprehensive probabilistic model
Jiaqi Liu ... Xiaoyan Zhu
BMC Genomics | VOL. 21
01 Nov 2020

Novel Methods for Correcting Next Generation Sequencing Errors in the β Chain of T Cell Receptors
Chrysi Panopoulou ... Nicos Maglaveras
01 Jan 2015
