High-Performance Deep Learning Toolbox for Genome-Scale Prediction of Protein Structure and Function.

Mu Gao, Xiao Chen, Chen Chen, Raj S Roy, Ryan Prout, Sajid Mahmud, Wael Elwasif, Ada Sedova, T Chad Effler, Alex Morehead, Farhan Quadir, Subil Abraham, Jianlin Cheng, Nabin Giri, Peik Lund-Andersen, Jeffrey Skolnick, N Quentin Haas

doi:10.1109/mlhpc54614.2021.00010


Open Access

https://doi.org/10.1109/mlhpc54614.2021.00010



Abstract


Computational biology is one of many scientific disciplines ripe for innovation and acceleration with the advent of high-performance computing (HPC). In recent years, the field of machine learning has also seen significant benefits from adopting HPC practices. In this work, we present a novel HPC pipeline that incorporates various machine-learning approaches for structure-based functional annotation of proteins on the scale of whole genomes. Our pipeline makes extensive use of deep learning and provides computational insights into best practices for training advanced deep-learning models on high-throughput data such as proteomics data. We showcase the methodologies our pipeline currently supports and detail future tasks for it to encompass, including large-scale sequence comparison using SAdLSA and prediction of protein tertiary structures using AlphaFold2.

Full Text