Abstract

Protein trafficking, or protein sorting, is the mechanism by which a cell transports proteins to their appropriate locations inside or outside the cell. This targeting is based on information contained in the protein itself. Many methods predict the subcellular location of eukaryotic proteins from sequence information, but most of them use a flat structure to perform the prediction. In this work, we introduce ensemble methods that predict locations in the eukaryotic protein-sorting non-membrane pathway hierarchically. We used features extracted exclusively from full-length protein sequences, together with feature subset selection, for classification. Sequence-driven features, sequence-mapped features and sequence autocorrelation features were tested with ensemble learners, and classifier performance was compared with and without feature subset selection. This study shows that the new features extracted from full-length eukaryotic protein sequences are effective at capturing biological features among compartments in the eukaryotic non-membrane pathway at two levels of the hierarchy. Feature subset selection also reduced the time taken to build the classification model.

Highlights

  • Eukaryotic cells are organized into several membrane-bound compartments

  • The final 1677 protein sequences were represented in two groups, formed by combining the three sequence feature types with and without feature subset selection

  • Five-fold cross-validation and independent data tests were performed on these two feature groups to evaluate the quality of the classifier
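
As a rough illustration of what a five-fold split over 1677 sequences looks like, the sketch below partitions sample indices into five folds and uses each fold once as the test set. The function name `k_fold_indices`, the shuffling seed, and the absence of class stratification are assumptions made for illustration; the source does not say how the folds were actually built.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Split sample indices into k folds; return a list of (train, test) pairs."""
    indices = list(range(n_samples))
    # Shuffle with a fixed seed so the split is reproducible (assumed choice)
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin into k folds of near-equal size
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        # Training set is every index that is not in the held-out fold
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits

splits = k_fold_indices(1677, k=5)
fold_sizes = [len(test) for _, test in splits]
```

Each sequence lands in exactly one test fold, so every sequence is used for evaluation once and for training four times.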

Introduction

To perform their functions, newly formed proteins are sorted and delivered to various compartments through the non-membrane and trans-membrane pathways [1]. This protein sorting process is very complex and still not clearly understood. In 1983, Nishikawa, Kubota and Ooi investigated the prediction of subcellular locations from amino acid composition. They reported that amino acid composition has the discriminating ability to classify subcellular locations.
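
The amino acid composition mentioned above can be sketched as a 20-dimensional vector of residue fractions. This is a minimal illustration, not the feature extraction used in the paper; the function name `amino_acid_composition`, the handling of non-standard residues, and the toy sequence are assumptions.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes

def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid in a protein sequence."""
    counts = Counter(sequence.upper())
    # Count only standard residues so ambiguous codes (e.g. X, B, Z) are ignored
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]

# Hypothetical toy sequence, used only to show the shape of the output
composition = amino_acid_composition("MKTAYIAKQR")
```

The resulting vector has a fixed length regardless of sequence length, which is what makes it usable as classifier input for proteins of different sizes.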
