Abstract

In this paper, an effective dimension reduction approach called semi-supervised discriminant analysis (SDA) is employed to deal with the protein subcellular localization problem. First, a novel protein sequence encoding method that combines pseudo amino acid composition (PseAAC) and dipeptide composition (DC) is introduced to represent a protein. Second, the SDA algorithm is applied to extract the essential discriminant features from the combined PseAAC and DC feature set. Finally, the K-nearest neighbor (K-NN) classifier is used to identify the subcellular localization of Gram-negative bacterial proteins. The proposed method can effectively utilize both the manifold information and the class information of the protein samples to guide the prediction of protein subcellular localization. To evaluate the prediction performance of the proposed algorithm, a jackknife test based on the nearest neighbor algorithm is performed on a Gram-negative bacterial protein data set. The results show that a high total accuracy can be obtained in a low-dimensional feature space, which indicates that the proposed approach is effective and practical.
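As an illustration of the DC part of the encoding, the sketch below computes a 400-dimensional dipeptide composition vector for one sequence. It assumes the common definition of DC (the relative frequency of each ordered pair of adjacent residues over the 20 standard amino acids); the abstract does not give the exact formula used in the paper, and the function name `dipeptide_composition` and the choice to skip pairs containing non-standard residues are assumptions made here for illustration.

```python
from itertools import product

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, in a fixed order (AA, AC, ..., YY).
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return the 400 dipeptide frequencies of a protein sequence.

    Assumption: pairs containing a non-standard residue (e.g. 'X')
    are skipped and do not count toward the total.
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        # Sequence too short or no valid pairs: return an all-zero vector.
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]
```

In a pipeline like the one described, a vector of this kind would be concatenated with the PseAAC features before dimension reduction; the PseAAC and SDA steps are not sketched here because the abstract does not specify their parameters.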

Highlights

  • Gram-negative bacteria are a class of bacteria that do not retain crystal violet dye in the Gram staining protocol

  • Among the three test methods, the jackknife test is the most widely used because it always yields a unique result for a given benchmark dataset

  • A novel nonlinear dimension reduction method named semi-supervised discriminant analysis (SDA) is utilized for membrane protein type prediction
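The jackknife test mentioned in the highlights can be sketched as a leave-one-out loop around a K-NN classifier: each sample is held out once and predicted from all the others. This is a minimal illustration, not the authors' implementation; the function names, the Euclidean distance, and the default `k=1` are assumptions, and the feature vectors are taken as already computed.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=1):
    """Predict the label of `query` by majority vote of its k nearest neighbors."""
    # Rank training samples by Euclidean distance to the query.
    order = sorted(range(len(train_X)),
                   key=lambda i: math.dist(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    # On a tie, the result is whichever tied label Counter returns first.
    return votes.most_common(1)[0][0]


def jackknife_accuracy(X, y, k=1):
    """Leave-one-out accuracy: hold out each sample once, train on the rest."""
    correct = 0
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        if knn_predict(rest_X, rest_y, X[i], k) == y[i]:
            correct += 1
    return correct / len(X)
```

Because every sample is held out exactly once and no random split is involved, repeated runs on the same benchmark dataset give the same accuracy, which is the property the highlight refers to.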

Introduction

Gram-negative bacteria are a class of bacteria that do not retain crystal violet dye in the Gram staining protocol. Many Gram-negative bacteria can cause disease in a host organism. Reliably determining the subcellular localization of a Gram-negative bacterial protein from its sequence can provide valuable information about its function and is helpful for drug development. It is therefore important to develop methods that accurately predict the subcellular localization of Gram-negative bacterial proteins. A number of effective computational approaches have been presented for protein subcellular localization prediction [1,2,3]. However, predicting protein subcellular localization accurately and automatically remains a challenge.
