Abstract

Sequencing technology has advanced rapidly. By parallelizing the whole procedure, a single run can produce millions to billions of short reads from a DNA molecule. Because sequencing is cost-effective and can be performed in a laboratory within a short period of time, the volume of biological sequencing data has exploded. There is, however, a tradeoff between the abundance and the accuracy of sequencing reads: limitations of the sequencing technology introduce errors into the reads. These errors can be substitutions, insertions, and/or deletions affecting one or more bases. Although error rates have dropped considerably as the technology has matured, errors remain a serious concern. Sequence assemblers often fail to reconstruct the entire genome because of errors in the reads. Identifying and correcting the erroneous bases not only yields higher-quality data but can also greatly reduce the computational complexity of many downstream biological applications. Traditional approaches use overlaps among the reads to correct them. Meanwhile, biologists have successfully sequenced thousands of species, and the list of species with an available reference genome is growing rapidly. Building on this fact, we have developed a novel hybrid error correction algorithm called HECTOR (Hybrid Error CorrecTOR). It employs both referential and de novo error correction techniques to correct errors in reads. Our extensive experiments show that HECTOR is an effective error correction algorithm.
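
To make the overlap-based (de novo) idea concrete, the following is a minimal sketch of k-mer spectrum correction, a common technique for fixing substitution errors using redundancy among the reads. It is not HECTOR's algorithm, whose details are not given in this abstract. The function names, the k-mer length `k`, and the `min_count` threshold are illustrative assumptions, and the sketch handles substitutions only, not insertions or deletions.

```python
from collections import Counter

BASES = "ACGT"


def kmer_counts(reads, k):
    """Count every k-mer that occurs in the given reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def correct_read(read, counts, k, min_count=2):
    """Repair substitution errors by replacing rare ("weak") k-mers.

    A k-mer seen fewer than min_count times is assumed to contain an
    error. Each single-base substitution is tried, and the most frequent
    "solid" alternative (count >= min_count) is accepted, if any exists.
    """
    bases = list(read)
    for i in range(len(bases) - k + 1):
        kmer = "".join(bases[i:i + k])
        if counts[kmer] >= min_count:
            continue  # k-mer is well supported by other reads
        best = None  # (support, offset within k-mer, replacement base)
        for pos in range(k):
            for b in BASES:
                if b == kmer[pos]:
                    continue
                candidate = kmer[:pos] + b + kmer[pos + 1:]
                support = counts[candidate]
                if support >= min_count and (best is None or support > best[0]):
                    best = (support, pos, b)
        if best is not None:
            bases[i + best[1]] = best[2]
    return "".join(bases)


if __name__ == "__main__":
    # Toy data: five error-free copies and one read with a G->C substitution.
    reads = ["ACGTACGTGA"] * 5 + ["ACGTACCTGA"]
    counts = kmer_counts(reads, k=4)
    print(correct_read("ACGTACCTGA", counts, k=4))
```

In this toy input the erroneous read differs from the other five at a single base, so the k-mers covering that base are rare and get replaced by their well-supported neighbors. A referential corrector would instead align each read to an available reference genome and compare bases there; a hybrid approach, as described above, combines both sources of evidence.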
