Abstract

Antioxidant proteins perform significant functions in maintaining oxidation/antioxidation balance and have potential therapies for some diseases. Accurate identification of antioxidant proteins could contribute to revealing physiological processes of oxidation/antioxidation balance and developing novel antioxidation-based drugs. In this study, an ensemble method is presented to predict antioxidant proteins with hybrid features, incorporating SSI (Secondary Structure Information), PSSM (Position Specific Scoring Matrix), RSA (Relative Solvent Accessibility), and CTD (Composition, Transition, Distribution). The prediction results of the ensemble predictor are determined by an average of prediction results of multiple base classifiers. Based on a classifier selection strategy, we obtain an optimal ensemble classifier composed of RF (Random Forest), SMO (Sequential Minimal Optimization), NNA (Nearest Neighbor Algorithm), and J48 with an accuracy of 0.925. A Relief combined with IFS (Incremental Feature Selection) method is adopted to obtain optimal features from hybrid features. With the optimal features, the ensemble method achieves improved performance with a sensitivity of 0.95, a specificity of 0.93, an accuracy of 0.94, and an MCC (Matthew’s Correlation Coefficient) of 0.880, far better than the existing method. To evaluate the prediction performance objectively, the proposed method is compared with existing methods on the same independent testing dataset. Encouragingly, our method performs better than previous studies. In addition, our method achieves more balanced performance with a sensitivity of 0.878 and a specificity of 0.860. These results suggest that the proposed ensemble method can be a potential candidate for antioxidant protein prediction. For public access, we develop a user-friendly web server for antioxidant protein identification that is freely accessible at http://antioxidant.weka.cc.

Highlights

  • ROS (Reactive Oxygen Species) are generated in aerobic metabolic processes as a result of endogenous and exogenous factors, such as air pollutants and cigarette smoke [1]

  • RF obtains balanced performance with an Sn of 0.9 and an Sp of 0.89. These results demonstrate that RF is relatively effective in antioxidant protein identification

  • To obtain the optimal ensemble classifier for antioxidant protein identification, we evaluate the accuracy of multiple individual classifiers and get a classifier list ranked by accuracy

Read more

Summary

Introduction

ROS (Reactive Oxygen Species) are generated in aerobic metabolic processes as a result of endogenous and exogenous factors, such as air pollutants and cigarette smoke [1]. Moderate concentrations of ROS can function in physiological oxidative processes of cells [2, 3], PLOS ONE | DOI:10.1371/journal.pone.0163274. An Effective Antioxidant Protein Predictor analysis, decision to publish, or preparation of the manuscript Moderate concentrations of ROS can function in physiological oxidative processes of cells [2, 3], PLOS ONE | DOI:10.1371/journal.pone.0163274 September 23, 2016

Methods
Results
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call