Fast_protein_cluster: parallel and optimized clustering of large-scale protein modeling data.

Ling-Hong Hung, Ram Samudrala

doi:10.1093/bioinformatics/btu098

Open Access

Journal: Bioinformatics (Oxford, England)
Publication Date: Feb 14, 2014
Citations: 17
License type: CC BY 3.0

Affiliation: University of Washington

Abstract

Motivation: fast_protein_cluster is a fast, parallel and memory-efficient package used to cluster 60 000 sets of protein models (with up to 550 000 models per set) generated by the Nutritious Rice for the World project.

Results: fast_protein_cluster is an optimized and extensible toolkit that supports Root Mean Square Deviation after optimal superposition (RMSD) and Template Modeling score (TM-score) as metrics. RMSD calculations using a laptop CPU are 60× faster than qcprot and 3× faster than current graphics processing unit (GPU) implementations. New GPU code further increases the speed of RMSD and TM-score calculations. fast_protein_cluster provides novel k-means and hierarchical clustering methods that are up to 250× and 2000× faster, respectively, than Clusco, and identify significantly more accurate models than Spicker and Clusco.

Availability and implementation: fast_protein_cluster is written in C++ using OpenMP for multi-threading support. Custom streaming Single Instruction Multiple Data (SIMD) extensions and advanced vector extension intrinsics code accelerate CPU calculations, and OpenCL kernels support AMD and Nvidia GPUs. fast_protein_cluster is available under the M.I.T. license (http://software.compbio.washington.edu/fast_protein_cluster).

Contact: lhhung@compbio.washington.edu

Supplementary information: Supplementary data are available at Bioinformatics online.
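For readers unfamiliar with the TM-score metric named in the abstract, the following is a minimal C++ sketch of the standard published TM-score formula evaluated for one fixed superposition. It is not taken from fast_protein_cluster: the names `Point`, `tm_d0` and `tm_score_fixed` are hypothetical, the 0.5 Å floor on the distance scale is an assumption for short chains, and the search over superpositions that a real TM-score calculation performs to maximize the score is deliberately omitted.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// One C-alpha coordinate in Angstroms (hypothetical type for this sketch).
using Point = std::array<double, 3>;

// Length-dependent distance scale d0 = 1.24 * cbrt(L - 15) - 1.8.
// Assumption: short chains are clamped to a floor of 0.5 Angstroms.
double tm_d0(std::size_t target_length) {
    if (target_length <= 15) return 0.5;
    const double d0 =
        1.24 * std::cbrt(static_cast<double>(target_length) - 15.0) - 1.8;
    return std::max(d0, 0.5);
}

// TM-score sum for residue pairs that are ALREADY superposed and paired
// one-to-one. A full TM-score calculation would additionally search for
// the superposition that maximizes this value; that step is not shown.
double tm_score_fixed(const std::vector<Point>& model,
                      const std::vector<Point>& native) {
    if (model.empty() || model.size() != native.size()) {
        throw std::invalid_argument("structures must be non-empty and equal in length");
    }
    const double d0 = tm_d0(native.size());
    const double d0_sq = d0 * d0;
    double sum = 0.0;
    for (std::size_t i = 0; i < native.size(); ++i) {
        double dist_sq = 0.0;
        for (std::size_t k = 0; k < 3; ++k) {
            const double diff = model[i][k] - native[i][k];
            dist_sq += diff * diff;
        }
        // Each residue pair contributes 1 / (1 + (d_i / d0)^2).
        sum += 1.0 / (1.0 + dist_sq / d0_sq);
    }
    return sum / static_cast<double>(native.size());
}
```

Under these assumptions, identical structures score 1.0, and a model whose residues all sit exactly d0 away from their native counterparts scores 0.5, which illustrates why the score is bounded in (0, 1] and comparatively insensitive to a few large local deviations.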
