Abstract

Membrane proteins are a vital class of proteins that serve as channels, receptors, and energy transducers in the cell. The functions they perform are closely tied to their types, and they are attractive drug-discovery targets for a range of diseases. Predicting membrane protein types is therefore a crucial and challenging research problem in bioinformatics and proteomics. Given the vast number of uncharacterized protein sequences in databases, conventional biophysical techniques are tedious, costly, and error-prone. It is consequently highly desirable to develop a robust, reliable, and efficient method for predicting membrane protein types. In this work, a novel feature set, Exchange Group Based Protein Sequence Representation (EGBPSR), is proposed for the classification of membrane proteins, together with two new feature extraction strategies: Exchange Group Local Pattern (EGLP) and Amino acid Interval Pattern (AIP). Decision tree classifiers often handle imbalanced and large datasets well. Because the dataset used here is imbalanced, the performance of several tree-based classifiers is analyzed: Decision Tree (DT), Classification and Regression Tree (CART), and the ensemble methods AdaBoost, Random Under Sampling (RUS) Boost, Rotation Forest, and Random Forest. The overall accuracy achieved in predicting membrane protein types is 96.45%.
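To make the idea of an exchange-group representation concrete, the following is a minimal sketch of how a protein sequence can be mapped onto exchange groups and summarized as a composition vector. It assumes the six exchange groups commonly used in the literature (derived from Dayhoff's PAM matrices); the abstract does not state which grouping EGBPSR uses, and it does not define EGLP or AIP, so neither is implemented here. The names `EXCHANGE_GROUPS`, `to_exchange_groups`, and `group_composition` are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Assumption: the six exchange groups commonly used in the literature.
# The paper's own grouping may differ.
EXCHANGE_GROUPS = {
    "e1": "HRK",
    "e2": "DENQ",
    "e3": "C",
    "e4": "STPAG",
    "e5": "MILV",
    "e6": "FYW",
}

# Reverse lookup: amino acid letter -> exchange group label.
AA_TO_GROUP = {
    aa: group for group, members in EXCHANGE_GROUPS.items() for aa in members
}


def to_exchange_groups(sequence):
    """Map each residue to its exchange group label, skipping unknown letters."""
    return [AA_TO_GROUP[aa] for aa in sequence.upper() if aa in AA_TO_GROUP]


def group_composition(sequence):
    """Return the relative frequency of each exchange group, ordered e1..e6."""
    groups = to_exchange_groups(sequence)
    counts = Counter(groups)
    total = len(groups) or 1  # avoid division by zero for empty input
    return [counts[group] / total for group in EXCHANGE_GROUPS]
```

A six-dimensional vector of this kind is only a baseline illustration of group-level encoding; it could be passed to any tree-based classifier, but it is not the feature set evaluated in the paper.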
