Abstract

Background: 454 sequencing is currently the method of choice for sequencing antibody repertoires and libraries containing large numbers (10^6 to 10^12) of different molecules with similar frameworks and variable regions, which poses significant challenges for identifying sequencing errors. Identifying and correcting sequencing errors in such mixtures is especially important for exploring complex maturation pathways and identifying putative germline predecessors of highly somatically mutated antibodies. To quantify and correct errors introduced by 454 antibody sequencing, we sequenced six antibodies at different known concentrations twice over and compared the reads with the corresponding known sequences determined by standard Sanger sequencing.

Results: We found that 454 antibody sequencing could lead to approximately 20% incorrect reads. Most of these errors were insertions at shorter homopolymer regions of 2-3 nucleotides in length; insertions, deletions and other variants at random sites were less frequent. Error correction might reduce the population of erroneous reads to 5-10%. However, errors accounting for 4-8% of the total reads could not be corrected unless sequencing is repeated several times, which may not be possible for large, diverse libraries and repertoires, including complete sets of antibodies (antibodyomes).

Conclusions: The experimental test procedure used to assess 454 antibody sequencing errors reveals a high proportion (up to 20%) of incorrect reads. The errors can be reduced to 5-10% but not further, which calls for caution to avoid false discovery of antibody variants and diversity.

Highlights

  • 454 sequencing is currently the method of choice for sequencing antibody repertoires and libraries containing large numbers (10^6 to 10^12) of different molecules with similar frameworks and variable regions, which poses significant challenges for identifying sequencing errors

  • To characterize and correct 454 antibody sequencing errors, we performed 454 sequencing of six clonally related antibodies of known sequence at different concentrations twice over and calculated the error rates by comparing the reads with the sequences obtained from standard Sanger sequencing

  • The high-throughput 454 sequence data obtained for the six antibodies were compared with their known sequences by pairwise sequence comparison using BLAST. This identified accurate reads as well as erroneous reads, in which we observed different types of errors and their frequencies: single-nucleotide insertions, deletions or substitutions, and errors involving two or more nucleotides. These results provided an assessment of the types and frequencies of errors observed in 454 sequencing of the six antibodies (#1-6), which were 3-fold serially diluted
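
The comparison described in the last highlight can be illustrated with a small sketch. This is not the authors' pipeline: the study used BLAST, whereas the code below substitutes Python's standard-library difflib.SequenceMatcher, which is a general diff tool rather than a sequence aligner. The function names (homopolymer_runs, classify_read, summarize) and the toy sequences are hypothetical and chosen only for illustration. The sketch labels each difference between a read and its known reference as an insertion, deletion or substitution, and flags insertions that fall in or next to a homopolymer run of the reference.

```python
from collections import Counter
from difflib import SequenceMatcher
from itertools import groupby


def homopolymer_runs(seq, min_len=2):
    """Return (start, base, length) for each run of identical bases >= min_len."""
    runs = []
    pos = 0
    for base, group in groupby(seq):
        n = len(list(group))
        if n >= min_len:
            runs.append((pos, base, n))
        pos += n
    return runs


def classify_read(reference, read):
    """Return (error_type, reference_position, length) for each difference.

    SequenceMatcher stands in for a real aligner such as BLAST (assumption).
    """
    matcher = SequenceMatcher(None, reference, read, autojunk=False)
    errors = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "insert":
            errors.append(("insertion", i1, j2 - j1))
        elif tag == "delete":
            errors.append(("deletion", i1, i2 - i1))
        elif tag == "replace":
            errors.append(("substitution", i1, max(i2 - i1, j2 - j1)))
    return errors


def summarize(reference, reads):
    """Count accurate reads and error types across a set of reads."""
    runs = homopolymer_runs(reference)
    counts = Counter()
    for read in reads:
        errors = classify_read(reference, read)
        counts["accurate_reads" if not errors else "erroneous_reads"] += 1
        for kind, pos, _ in errors:
            counts[kind] += 1
            # An insertion counts as homopolymer-associated when its reference
            # position lies inside a run or at either edge of one.
            if kind == "insertion" and any(s <= pos <= s + n for s, _, n in runs):
                counts["homopolymer_insertion"] += 1
    return counts


if __name__ == "__main__":
    reference = "ACGTTACG"      # toy reference with one TT homopolymer
    reads = [
        "ACGTTACG",             # accurate read
        "ACGTTTACG",            # extra T in the homopolymer
        "ACGTACG",              # one T deleted
        "ACGTAACG",             # T replaced by A
    ]
    counts = summarize(reference, reads)
    total = counts["accurate_reads"] + counts["erroneous_reads"]
    print(dict(counts))
    print("fraction of erroneous reads:", counts["erroneous_reads"] / total)
```

With real data, each read would first need to be assigned to the correct reference antibody and properly aligned; a character-level diff like this one can misplace errors in repetitive regions, so the output should be read as a rough tally only.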


Summary

Introduction

454 sequencing is currently the method of choice for sequencing antibody repertoires and libraries containing large numbers (10^6 to 10^12) of different molecules with similar frameworks and variable regions, which poses significant challenges for identifying sequencing errors. To quantify and correct errors introduced by 454 antibody sequencing, we sequenced six different antibodies at varied known concentrations twice over and compared the reads with the original sequences determined by standard Sanger sequencing. This allowed us to identify the types of errors, estimate error rates, and suggest corrections applicable to 454 antibody sequencing for better confidence in the assessment of data quality.

Methods
Results
Conclusion

