Abstract
Multi-label classification methods are important in various fields, such as protein type and protein function classification, semantic scene classification, and music categorization. In multi-label classification, each sample can be associated with a set of class labels. In protein type classification, one of the major classes of protein is the membrane protein. Membrane proteins carry out different cellular processes and important functions, which depend on the protein type. A membrane protein can play several roles at the same time. In this study, we propose membrane protein type classification using the Decision Tree (DT) classification algorithm. The DT classifies a membrane protein into six types. An essential set of features is extracted from the membrane protein dataset S1 and used by the proposed method, which achieved an accuracy of 69.81%, whereas the existing network-based and shortest-path methods achieved accuracies of 66.78% and 54.97%, respectively. The accuracies of the existing methods were not obtained on the full set of proteins in dataset S1, but only after the removal of a few unannotated proteins. In terms of both accuracy and complexity, the proposed method appears to be better than the existing methods.
Highlights
Multi-label classification methods are increasingly used in recent research, for example in protein function prediction, protein type prediction, semantic scene classification, and music categorization. Multi-label classification is a generalization of multi-class classification, which is the single-label problem of assigning each instance to exactly one of more than two classes. The main feature of a multi-label problem is that an instance can be assigned to any number of classes.
We propose a multi-label classification of different types of membrane proteins by implementing a Decision Tree (DT) classifier algorithm.
The results of the proposed DT classification are shown in the table.
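As a rough illustration of how a decision tree can be applied to a multi-label problem, the sketch below trains one small tree per label (the "binary relevance" strategy). This is an assumption made for illustration: the text does not state how the proposed DT handles multiple labels, and the feature values, label columns, and function names here are hypothetical, not taken from dataset S1 or from the authors' implementation.

```python
from collections import Counter

def gini(rows):
    """Gini impurity of the labels in rows of (features, label)."""
    n = len(rows)
    if n == 0:
        return 0.0
    counts = Counter(label for _, label in rows)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_split(rows):
    """Return the (feature index, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(rows)
    for f in range(len(rows[0][0])):
        for threshold in sorted({x[f] for x, _ in rows}):
            left = [r for r in rows if r[0][f] <= threshold]
            right = [r for r in rows if r[0][f] > threshold]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best_score:
                best, best_score = (f, threshold), score
    return best

def build_tree(rows, depth=0, max_depth=3):
    """Grow a tree recursively; a leaf stores the majority label of its rows."""
    split = best_split(rows) if depth < max_depth else None
    if split is None:
        return Counter(label for _, label in rows).most_common(1)[0][0]
    f, threshold = split
    left = [r for r in rows if r[0][f] <= threshold]
    right = [r for r in rows if r[0][f] > threshold]
    return (f, threshold,
            build_tree(left, depth + 1, max_depth),
            build_tree(right, depth + 1, max_depth))

def predict_one(tree, x):
    """Walk from the root to a leaf and return the leaf label."""
    while isinstance(tree, tuple):
        f, threshold, left, right = tree
        tree = left if x[f] <= threshold else right
    return tree

def fit_multilabel(X, Y, max_depth=3):
    """Binary relevance: fit one tree for each label column of Y."""
    return [build_tree([(x, y[j]) for x, y in zip(X, Y)], max_depth=max_depth)
            for j in range(len(Y[0]))]

def predict_multilabel(trees, x):
    """Return one 0/1 decision per label for a single sample."""
    return [predict_one(tree, x) for tree in trees]

# Hypothetical toy data: two numeric features, two label columns.
X = [[0.1, 0.9], [0.2, 0.8], [0.8, 0.2], [0.9, 0.1], [0.5, 0.5]]
Y = [[1, 0], [1, 0], [0, 1], [0, 1], [1, 1]]  # last sample carries both labels

trees = fit_multilabel(X, Y)
prediction = predict_multilabel(trees, [0.5, 0.5])
```

Training one tree per label keeps each tree a plain binary classifier, at the cost of ignoring correlations between labels. A single tree with multi-output leaves is an alternative design.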
Summary
Multi-label classification methods are increasingly used in recent research, for example in protein function prediction, protein type prediction, semantic scene classification, and music categorization. Multi-label classification is a generalization of multi-class classification, which is the single-label problem of assigning each instance to exactly one of more than two classes. The main feature of a multi-label problem is that an instance can be assigned to any number of classes.
An earlier study [22] proposed an integrated approach to predict multiple types of membrane proteins by employing sequence homology and a protein-protein interaction network. According to their positions and intramolecular arrangements in a cell, membrane proteins are classified into six types: (1) GPI (glycosylphosphatidylinositol)-anchor; (2) lipid-anchor (LCM); (3) multi-pass (MPT); (4) peripheral (PM); (5) single-pass type I; and (6) single-pass type II membrane proteins, as shown in Fig. 3.
To evaluate the performance of the prediction method, W. Li et al. [23] used the sequence clustering program CD-HIT (Cluster Database at High Identity with Tolerance) [24] to prepare the benchmark dataset S1, reducing an initial set of 3789 to 2935 protein sequences with sequence similarity below 70%. Our proposed method uses the same dataset S1 (2935 proteins) for classification.
Hydrophobic amino acids are likely to be found in the interior, whereas hydrophilic amino acids are likely to be in contact with the aqueous environment.
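The difference between the multi-class and multi-label settings described in this summary can be made concrete with a binary indicator encoding over the six membrane protein types. The sketch below is illustrative only: the short codes "GPI", "SP1" and "SP2" are hypothetical abbreviations (the text supplies only LCM, MPT and PM), and the protein names are made up rather than drawn from dataset S1.

```python
# Short codes for the six membrane protein types listed in the summary.
# "LCM", "MPT" and "PM" follow the text; "GPI", "SP1" and "SP2" are
# hypothetical abbreviations chosen for this example.
TYPES = ["GPI", "LCM", "MPT", "PM", "SP1", "SP2"]

def to_indicator(labels):
    """Encode a set of type labels as a 0/1 vector ordered like TYPES."""
    unknown = set(labels) - set(TYPES)
    if unknown:
        raise ValueError(f"unknown membrane protein types: {sorted(unknown)}")
    return [1 if t in labels else 0 for t in TYPES]

def from_indicator(vector):
    """Decode a 0/1 vector back into the set of type labels."""
    return {t for t, bit in zip(TYPES, vector) if bit}

# Hypothetical annotations (illustrative names, not from dataset S1).
annotations = {
    "protein_A": {"MPT"},        # one type: also expressible as multi-class
    "protein_B": {"LCM", "PM"},  # two types: needs a multi-label encoding
}

encoded = {name: to_indicator(labels) for name, labels in annotations.items()}
```

In a multi-class encoding every vector would contain exactly one 1. The multi-label encoding lifts that restriction, which is what allows a protein such as the hypothetical `protein_B` to carry two types at once.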