CATE: A fast and scalable CUDA implementation to conduct highly parallelized evolutionary tests on large scale genomic data

Deshan Perera, Elsa Reisenhofer, Christian D. Huber, Eve Higgins, Said Hussein, Quan Long

doi:10.1111/2041-210x.14168


Open Access

https://doi.org/10.1111/2041-210x.14168


Abstract

Statistical tests for molecular evolution provide quantifiable insights into the selection pressures that govern a genome's evolution. Increasing the sample size used for analysis leads to higher statistical power. However, this requires more computational nodes or longer computational time. CATE (CUDA Accelerated Testing of Evolution) is a computational solution to this problem comprising two main innovations. The first is a file organization system coupled with a novel search algorithm, and the second is a large-scale parallelization of algorithms using both the graphical processing unit (GPU) and the central processing unit (CPU). CATE is capable of conducting evolutionary tests such as Tajima's D, Fu and Li's and Fay and Wu's test statistics, the McDonald–Kreitman Neutrality Index, the Fixation Index and Extended Haplotype Homozygosity. CATE is orders of magnitude faster than standard tools, with benchmarks estimating it to be, on average, over 180 times faster. For instance, CATE processes all 54,849 human genes for all 22 autosomal chromosomes across the five super populations present in the 1000 Genomes Project in less than 30 min, while counterpart software took 3.62 days. This proven framework has the potential to be adapted for GPU-accelerated, large-scale parallel execution of many evolutionary and genomic analyses.
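To make one of the named tests concrete, the following is a minimal CPU sketch of Tajima's D in Python, using the standard formula from Tajima (1989). It is not CATE's implementation, and the function name `tajimas_d` and its input layout (one allele count per biallelic site) are assumptions chosen for illustration. The per-site terms are independent of each other, which is the kind of workload that lends itself to the GPU parallelization the abstract describes.

```python
import math


def tajimas_d(allele_counts, n):
    """Tajima's D for one genomic region (illustrative sketch, not CATE's code).

    allele_counts: for each biallelic site, the number of sampled sequences
                   carrying one of the two alleles (hypothetical input layout).
    n:             number of sampled sequences.
    Returns None when the statistic is undefined.
    """
    if n < 4:
        # The variance term is zero for n < 4, so D is undefined.
        return None

    # Only segregating sites (both alleles present in the sample) contribute.
    seg = [k for k in allele_counts if 0 < k < n]
    s = len(seg)
    if s == 0:
        return None

    # Mean number of pairwise differences: each site contributes
    # k * (n - k) differing pairs out of n * (n - 1) / 2 possible pairs.
    pairs = n * (n - 1) / 2
    pi = sum(k * (n - k) for k in seg) / pairs

    # Constants that depend only on the sample size.
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * s + e2 * s * (s - 1)
    if variance <= 0:
        return None

    # Difference between the two estimators of theta, scaled by its
    # estimated standard deviation.
    return (pi - s / a1) / math.sqrt(variance)
```

Under these assumptions, a region dominated by rare variants (for example, 20 singleton sites in 10 sequences) gives a negative D, and a region dominated by intermediate-frequency variants gives a positive D.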


Journal: Methods in Ecology and Evolution
Publication Date: Jun 29, 2023
Citations: 1
License type: CC BY-NC-ND 4.0


