Identification of antioxidant proteins using a discriminative intelligent model of k-space amino acid pairs based descriptors incorporating with ensemble feature selection

Ashfaq Ahmad, Shahid Akbar, Maqsood Hayat, Farman Ali, Salman Khan, Mohammad Sohail

doi:10.1016/j.bbe.2020.10.003

Abstract

Antioxidant proteins are closely associated with disease control because of their capability to eradicate excess free radicals. Accurate identification of antioxidant proteins is attracting growing interest owing to their therapeutic significance. However, despite the rapid increase of such diseases in the human body, the machine learning algorithms applied so far have performed inadequately in identifying antioxidant proteins. Given the effect of antioxidant proteins on the human body, a reliable intelligent model is therefore indispensable for researchers. In this study, primary protein sequences are formulated using evolutionary and sequence-based numerical descriptors. Evolutionary features are collected using a bigram position-specific scoring matrix, while K-space amino acid pair (KSAAP) and dipeptide composition are used to extract sequential information. Furthermore, to reduce computational time and to eliminate irrelevant and noisy features, an ensemble approach based on sequential forward selection and support vector machine (SFS-SVM) is applied to select optimal features. Finally, several classification learning methods of distinct nature are applied to choose a suitable operational engine for the model. In the empirical results, SVM using the optimal features achieved accuracies of 97.54% and 93.71% on the training and independent datasets, respectively. The proposed model outperformed the existing computational models. It is expected that the developed model may play a useful role in academic research as well as in proteomics and drug development. The source code and all datasets are publicly available at https://github.com/salman-khan-mrd/Antioxident_proteins.
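
The sequence-based descriptors named in the abstract can be illustrated with a minimal sketch. It assumes the commonly used definition of k-spaced amino acid pairs, in which residue pairs separated by k intervening positions are counted and normalized by the number of valid pairs; the authors' exact formulation and normalization may differ. The function name `k_spaced_pair_composition` and the toy sequence are hypothetical, and with k = 0 the function reduces to the ordinary dipeptide composition.

```python
from itertools import product

# The 20 standard amino acids and all 400 ordered residue pairs.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def k_spaced_pair_composition(sequence, k=0):
    """Return 400 frequencies of residue pairs separated by k positions.

    Assumed definition (not taken from the paper): pair (i, i + k + 1),
    normalized by the number of counted pairs. k=0 is dipeptide composition.
    """
    counts = dict.fromkeys(PAIRS, 0)
    total = 0
    for i in range(len(sequence) - k - 1):
        pair = sequence[i] + sequence[i + k + 1]
        if pair in counts:  # skip pairs containing non-standard residues
            counts[pair] += 1
            total += 1
    if total == 0:
        # Sequence too short (or no standard pairs): return an all-zero vector.
        return [0.0] * len(PAIRS)
    return [counts[p] / total for p in PAIRS]


toy = "ACDAC"  # made-up sequence, for illustration only
dpc = k_spaced_pair_composition(toy, k=0)     # dipeptide composition
ksaap1 = k_spaced_pair_composition(toy, k=1)  # pairs with one residue in between
```

Feature vectors for several values of k could be concatenated before feature selection; which values of k the authors use is not stated in the abstract.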

Full Text