Abstract

Short read mapping aligns short reads, which are fixed-length fragments of a target genome, to a given reference genome in order to identify mutations in the target genome. Because of the rapid development of Next Generation Sequencing (NGS) technologies, faster short read mapping is required. In this paper, we propose a variable-length hash method to further accelerate FPGA short read mapping systems. In hash-based short read mapping algorithms, a fixed-length substring of each short read, called a seed, is used as the key. However, because of the high non-uniformity of the human genome, many different seeds are mapped to the same hash slots, and many fruitless key comparisons are performed. To equalize the slot sizes, we propose an optimized hash function that changes its bit masks adaptively. This approach can improve the performance of any FPGA short read mapping system based on hash functions. In our FPGA system on a Xilinx XC7VX690T and XC6VLX240T, the performance of the key comparison is improved twofold, and the total performance exceeds that of existing FPGA systems.
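To make the idea of an adaptively changing bit mask concrete, the following is a minimal software sketch in Python, not the FPGA design described in the paper. The abstract does not specify how the masks are chosen, so everything here is an assumption made for illustration: seeds are encoded with 2 bits per base, the hash key is the leading `bits` bits of the encoded seed, and any slot holding more than `max_slot` entries is split by widening the mask by `step_bits`. The names `encode`, `build_index`, `lookup`, `base_bits`, `max_slot`, and `step_bits` are hypothetical.

```python
from collections import defaultdict

# Assumed 2-bit encoding; only uppercase A/C/G/T are handled in this sketch.
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def encode(seed):
    """Pack a seed into an integer, 2 bits per base, first base most significant."""
    value = 0
    for base in seed:
        value = (value << 2) | BASE_CODE[base]
    return value


def build_index(reference, seed_len, base_bits, max_slot, step_bits=2):
    """Index every seed of the reference; oversized slots get a wider mask."""
    total_bits = 2 * seed_len
    entries = [
        (encode(reference[i:i + seed_len]), i)
        for i in range(len(reference) - seed_len + 1)
    ]
    # Maps (mask width, key) to a list of (seed code, position),
    # or to None when the slot was split and a wider mask must be used.
    index = {}

    def insert(items, bits):
        buckets = defaultdict(list)
        for code, pos in items:
            buckets[code >> (total_bits - bits)].append((code, pos))
        for key, bucket in buckets.items():
            if len(bucket) > max_slot and bits < total_bits:
                index[(bits, key)] = None  # split marker
                insert(bucket, min(bits + step_bits, total_bits))
            else:
                index[(bits, key)] = bucket

    insert(entries, base_bits)
    return index


def lookup(index, seed, seed_len, base_bits, step_bits=2):
    """Return reference positions of a seed, widening the mask at split slots."""
    total_bits = 2 * seed_len
    code = encode(seed)
    bits = base_bits
    while True:
        key = (bits, code >> (total_bits - bits))
        if key not in index:
            return []
        bucket = index[key]
        if bucket is not None:
            # Key comparison: keep only entries whose full seed matches.
            return [pos for c, pos in bucket if c == code]
        bits = min(bits + step_bits, total_bits)


reference = "ACGTACGTACGTTTTT"
index = build_index(reference, seed_len=4, base_bits=2, max_slot=2)
print(lookup(index, "ACGT", seed_len=4, base_bits=2))  # [0, 4, 8]
```

In this toy reference the seed `ACGT` occurs three times, so the slot for the prefix `A` exceeds `max_slot` and is split with a wider mask. Seeds in sparsely populated regions keep the short mask, which is the slot-equalizing effect the abstract describes. Identical repeated seeds cannot be separated by any mask, so a slot at full mask width may still exceed `max_slot`.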
