Abstract

Ongoing developments and cost decreases in next-generation sequencing (NGS) technologies have led to an increase in their application, which has greatly enhanced the fields of genetics and genomics. Mapping sequence reads onto a reference genome is a fundamental step in the analysis of NGS data. Efficient and accurate alignment of the reads onto the reference genome is very important because it determines the overall quality of downstream analyses. In this study, we evaluate the performance of three Burrows-Wheeler transform-based mappers, BWA, Bowtie2, and HISAT2, in the context of paired-end Illumina whole-genome sequencing of livestock, using simulated sequence data sets with varying sequence read lengths, insert sizes, and levels of genomic coverage, as well as five real data sets. The mappers were evaluated on two criteria: computational resource and time requirements, and robustness of mapping. Our results show that BWA and Bowtie2 tended to be more robust than HISAT2, while HISAT2 was significantly faster and used less memory than both BWA and Bowtie2. We conclude that no single mapper is ideal in all scenarios; rather, the choice of alignment tool should be driven by the application and sequencing technology.

Summary

INTRODUCTION

Ongoing developments and cost decreases in next-generation sequencing (NGS) technologies have led to an increase in their application, which has greatly enhanced the fields of genetics and genomics. Mapping algorithms can largely be grouped into two categories based on the properties of their indices: algorithms based on hash tables, and algorithms based on the Burrows-Wheeler transform (BWT; Li and Homer, 2010). Due to their computational efficiency, BWT-based algorithms have become increasingly popular (Zhang et al., 2013). The Burrows-Wheeler Aligner (BWA; Li and Durbin, 2009) and Bowtie2 (Langmead and Salzberg, 2012) have been used in a large number of livestock studies. We tested these two mappers and HISAT2 (Kim et al., 2015), a newly released software package, using simulated sequence data sets with varying sequence read lengths, insert sizes, and levels of genomic coverage. This is the first evaluation of HISAT2 applied to whole-genome sequence data.
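
To make the idea of a BWT-based index concrete, the following is a minimal Python sketch of how a read can be located in a reference through backward search over the Burrows-Wheeler transform. It is an illustration only and is not taken from BWA, Bowtie2, or HISAT2: the function names `build_bwt` and `find_exact` are hypothetical, the suffix array is built naively, and occurrence counts are recomputed by scanning rather than read from the compressed, sampled tables that production mappers rely on. It also handles exact matches only, whereas real mappers tolerate mismatches and gaps.

```python
from collections import Counter


def build_bwt(reference):
    """Return (bwt, suffix_array) for `reference`.

    Naive construction for illustration; assumes '$' does not occur
    in the reference and sorts before every other character.
    """
    text = reference + "$"  # sentinel marks the end of the text
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    # The BWT is the character preceding each sorted suffix
    # (text[-1] is the sentinel when the suffix starts at 0).
    bwt = "".join(text[i - 1] for i in suffix_array)
    return bwt, suffix_array


def find_exact(bwt, suffix_array, read):
    """Return sorted 0-based positions where `read` occurs exactly."""
    # first[c] = number of characters in the text smaller than c
    counts = Counter(bwt)
    first, total = {}, 0
    for c in sorted(counts):
        first[c] = total
        total += counts[c]

    def occ(c, i):
        # Number of occurrences of c in bwt[:i]; a real index
        # would look this up in a precomputed table instead.
        return bwt[:i].count(c)

    lo, hi = 0, len(bwt)  # current suffix-array interval [lo, hi)
    for c in reversed(read):  # extend the match one base to the left
        if c not in first:
            return []
        lo = first[c] + occ(c, lo)
        hi = first[c] + occ(c, hi)
        if lo >= hi:
            return []
    return sorted(suffix_array[lo:hi])


if __name__ == "__main__":
    bwt, sa = build_bwt("ACGTACGTTACG")  # toy reference, not real data
    print(find_exact(bwt, sa, "ACG"))
```

The search visits each base of the read once and narrows an interval of the suffix array, so its cost depends on the read length rather than on the size of the reference, which is one reason BWT-based indices are regarded as computationally efficient.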
