PScL-HDeep: image-based prediction of protein subcellular location in human tissue using ensemble learning of handcrafted and deep learned features with two-layer feature selection.

Matee Ullah, Ke Han, Fazal Hadi, Jian Xu, Jiangning Song, Dong-Jun Yu

doi:10.1093/bib/bbab278

Abstract

Protein subcellular localization plays a crucial role in characterizing the function of proteins and understanding various cellular processes. Therefore, accurate identification of protein subcellular location is an important yet challenging task. Numerous computational methods have been proposed to predict the subcellular location of proteins. However, most existing methods are limited in overall accuracy, time consumption and generalization power. To address these problems, in this study we developed a novel computational approach based on Human Protein Atlas (HPA) data, referred to as PScL-HDeep, for accurate and efficient image-based prediction of protein subcellular location in human tissues. We extracted different handcrafted and deep learned features (the latter by employing a pretrained deep learning model) from different viewpoints of the image. The stepwise discriminant analysis (SDA) algorithm was applied to generate the optimal feature set from each original raw feature set. To obtain a more informative feature subset, the support vector machine-based recursive feature elimination with correlation bias reduction (SVM-RFE + CBR) feature selection algorithm was then applied to the integrated feature set. Finally, the classification models, namely support vector machine with radial basis function kernel (SVM-RBF) and support vector machine with linear kernel (SVM-LNR), were trained on the final selected feature set. To evaluate the performance of the proposed method, a new gold standard benchmark training dataset was constructed from the HPA databank. PScL-HDeep achieved the maximum performance in the 10-fold cross-validation test on this dataset and showed better efficacy than existing predictors. Furthermore, we illustrated the generalization ability of the proposed method by conducting a stringent independent validation test.
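
The workflow described in the abstract (per-feature-set selection, a second selection pass on the integrated set, then two SVM classifiers evaluated by 10-fold cross-validation) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses scikit-learn on synthetic data rather than HPA images; a univariate F-test filter (SelectKBest) stands in for SDA; plain SVM-RFE is used without the correlation bias reduction step; and the two SVMs are combined by soft voting, a combination rule the abstract does not specify. The column split and all feature counts are hypothetical.

```python
from sklearn.compose import ColumnTransformer
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.feature_selection import RFE, SelectKBest, f_classif
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for image features: columns 0-29 play the role of a
# handcrafted feature set, columns 30-59 the role of a deep learned one.
X, y = make_classification(
    n_samples=200, n_features=60, n_informative=12, n_classes=3, random_state=0
)
handcrafted_cols = list(range(0, 30))
deep_cols = list(range(30, 60))

# Layer 1: select features within each raw feature set separately, then
# concatenate the survivors (F-test filter used here in place of SDA).
layer1 = ColumnTransformer([
    ("handcrafted", SelectKBest(f_classif, k=15), handcrafted_cols),
    ("deep", SelectKBest(f_classif, k=15), deep_cols),
])

# Layer 2: recursive feature elimination driven by a linear SVM on the
# integrated feature set (plain SVM-RFE, without correlation bias reduction).
layer2 = RFE(SVC(kernel="linear"), n_features_to_select=10, step=2)

# Classifiers: RBF-kernel and linear-kernel SVMs, combined by averaging
# their class probabilities (soft voting is an assumption of this sketch).
ensemble = VotingClassifier(
    estimators=[
        ("svm_rbf", SVC(kernel="rbf", probability=True, random_state=0)),
        ("svm_lnr", SVC(kernel="linear", probability=True, random_state=0)),
    ],
    voting="soft",
)

model = Pipeline([
    ("scale", StandardScaler()),
    ("layer1", layer1),
    ("layer2", layer2),
    ("ensemble", ensemble),
])

# 10-fold cross-validation; selection is refit inside each fold, so the
# held-out fold never influences which features are kept.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv)
print(f"mean accuracy over {len(scores)} folds: {scores.mean():.3f}")
```

Wrapping both selection layers in the same pipeline as the classifiers keeps feature selection inside the cross-validation loop, which avoids the optimistic bias that arises when features are chosen on the full dataset before splitting.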


Journal: Briefings in Bioinformatics
Publication Date: Jul 30, 2021
Citations: 30

