Abstract
Short read mapping aligns short reads, which are fixed-length fragments of the target genome, to a given reference genome in order to identify mutations in the target genome. Because of the rapid development of Next Generation Sequencing (NGS) technologies, faster short read mapping is required. In this paper, we propose a variable-length hash method to further accelerate FPGA short read mapping systems. In hash-based short read mapping algorithms, a fixed-length substring of each short read, called a seed, is used as the key. However, because the human genome is highly non-uniform, many different seeds are mapped to the same hash slots, and many fruitless key comparisons are performed. To equalize the slot sizes, we propose an optimized hash function that changes the bit masks adaptively. This approach can improve the performance of all FPGA short read mapping systems based on hash functions. In our FPGA systems on a Xilinx XC7VX690T and an XC6VLX240T, the performance of the comparison step is improved twofold, and the total performance exceeds that of any existing FPGA system.
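The idea of equalizing slot sizes by adapting the bit mask can be illustrated with a small software sketch. It is not the paper's hash function or hardware design, which the abstract does not specify. It assumes a 2-bit encoding per base and a simple rule: start from a short key (the leading bases of each seed) and lengthen the key by one base for any slot that holds more than a threshold number of entries. The names `encode`, `prefix_mask`, `build_index`, `lookup`, `min_bases`, and `max_bucket` are all hypothetical.

```python
from collections import defaultdict

# Assumed 2-bit encoding of the four bases (hypothetical choice).
BASE_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}


def encode(seed):
    """Pack a DNA string into an integer, 2 bits per base."""
    value = 0
    for base in seed:
        value = (value << 2) | BASE_BITS[base]
    return value


def prefix_mask(num_bases, seed_len):
    """Bit mask that keeps only the leading num_bases bases of a packed seed."""
    return ((1 << (2 * num_bases)) - 1) << (2 * (seed_len - num_bases))


def build_index(reference, seed_len, min_bases, max_bucket):
    """Index every seed of the reference under a variable-length key.

    A slot is split by extending its mask by one base whenever it holds
    more than max_bucket entries and the key is still shorter than the seed.
    """
    mask = prefix_mask(min_bases, seed_len)
    slots = defaultdict(list)
    for pos in range(len(reference) - seed_len + 1):
        code = encode(reference[pos:pos + seed_len])
        slots[(min_bases, code & mask)].append((pos, code))

    index = {}
    stack = list(slots.items())
    while stack:
        (num_bases, key), bucket = stack.pop()
        if len(bucket) <= max_bucket or num_bases == seed_len:
            index[(num_bases, key)] = [pos for pos, _ in bucket]
            continue
        longer = prefix_mask(num_bases + 1, seed_len)
        split = defaultdict(list)
        for pos, code in bucket:
            split[(num_bases + 1, code & longer)].append((pos, code))
        stack.extend(split.items())
    return index


def lookup(index, seed, min_bases):
    """Return candidate positions for a seed.

    Candidates share only the masked prefix with the seed, so a full
    comparison against the reference is still needed afterwards.
    """
    code = encode(seed)
    seed_len = len(seed)
    for num_bases in range(min_bases, seed_len + 1):
        hit = index.get((num_bases, code & prefix_mask(num_bases, seed_len)))
        if hit is not None:
            return hit
    return []


reference = "AAAAAAAAACGT"
index = build_index(reference, seed_len=4, min_bases=2, max_bucket=2)
print(lookup(index, "ACGT", min_bases=2))
print(lookup(index, "AAAC", min_bases=2))
```

In this toy reference the repetitive `AAAA` seeds would otherwise share one large slot with `AAAC` and `AACG`. Lengthening the mask only for that slot separates them, while the rare `AC` prefix keeps its short key. Because the stored keys form a prefix-free set, at most one key length matches any query seed.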