Abstract

Third-generation sequencing technologies have transformed genomics, and the rapid growth in sequencing data creates an urgent demand for accurate and fast algorithms. We developed a parallelized, accuracy-lossless algorithm for maximal exact match (MEM) retrieval to address the computational bottleneck of uLTRA, a leading spliced alignment algorithm known for its precision on long RNA sequencing (RNA-seq) reads. The algorithm uses a multi-threaded strategy that processes multiple reads concurrently. We also serialize the index required for MEM retrieval so that it can be reused, which accelerates startup in practical tasks. Extensive experiments demonstrate that our parallel algorithm achieves significant improvements in runtime, speedup, throughput, and memory usage. On the largest human dataset, the algorithm achieves a speedup of 10.78×, substantially improving throughput. Moreover, integrating the parallel MEM retrieval algorithm into the uLTRA pipeline introduces a dual-layered parallel capability, consistently yielding a speedup of 4.99× compared to the multi-process, single-threaded execution of uLTRA. Analysis of the experimental results shows that the algorithm makes effective use of parallel processing and performs well on large datasets. This study showcases parallelization strategies for MEM retrieval within spliced alignment algorithms, facilitating RNA-seq data analysis. The code is available at https://github.com/RongxingWong/AcceleratingSplicedAlignment.
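The two ideas named in the abstract, per-read parallelism and a serialized, reusable index, can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes a simple k-mer hash index in place of whatever index structure the actual tool uses, and every name in it (`build_kmer_index`, `save_index`, `load_index`, `find_mems`, `find_mems_parallel`) is hypothetical. In CPython the global interpreter lock prevents a real speedup from threads here, so the sketch shows only the structure: reads are independent work items that share one read-only index.

```python
import pickle
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def build_kmer_index(ref, k):
    """Map every k-mer of the reference to the list of its start positions."""
    index = defaultdict(list)
    for j in range(len(ref) - k + 1):
        index[ref[j:j + k]].append(j)
    return dict(index)


def save_index(index, path):
    """Serialize the index to disk so later runs can skip rebuilding it."""
    with open(path, "wb") as fh:
        pickle.dump(index, fh)


def load_index(path):
    """Load a previously serialized index."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


def find_mems(read, ref, index, k):
    """Return sorted (ref_start, read_start, length) MEMs of length >= k."""
    mems = []
    for i in range(len(read) - k + 1):
        for j in index.get(read[i:i + k], ()):
            # Not left-maximal: the same match is reported from an earlier seed.
            if i > 0 and j > 0 and read[i - 1] == ref[j - 1]:
                continue
            # Extend to the right until a mismatch or the end of either string.
            length = k
            while (i + length < len(read) and j + length < len(ref)
                   and read[i + length] == ref[j + length]):
                length += 1
            mems.append((j, i, length))
    return sorted(mems)


def find_mems_parallel(reads, ref, index, k, threads=4):
    """Process reads concurrently; results keep the input order of the reads."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(lambda r: find_mems(r, ref, index, k), reads))
```

Because each read is handled independently against a read-only index, the parallel results are identical to the serial ones, which is the sense in which such a scheme can be accuracy-lossless.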
