Prediction of Self-Interacting Proteins from Protein Sequence Information Based on Random Projection Model and Fast Fourier Transform.

Zhan-Heng Chen, Zhu-Hong You, Li-Ping Li, Leon Wong, Hai-Cheng Yi, Yan-Bin Wang

doi:10.3390/ijms20040930

Abstract

Predicting self-interacting proteins (SIPs) is an important problem in bioinformatics. A SIP is a protein for which two or more identical copies, expressed from the same gene, can interact with each other. Self-interaction plays a major role in the evolution of protein-protein interactions (PPIs) and in cellular functions. Because experimental identification of SIPs is limited, computational tools that predict SIPs from protein sequence information are increasingly needed. We therefore propose a novel prediction model called RP-FFT, which combines the Random Projection (RP) model with the Fast Fourier Transform (FFT) to detect SIPs. First, each protein sequence was transformed into a Position-Specific Scoring Matrix (PSSM) using Position-Specific Iterated BLAST (PSI-BLAST). Second, features were extracted from each PSSM with the FFT. Lastly, we evaluated RP-FFT on human and yeast datasets and compared the RP classifier with the state-of-the-art support vector machine (SVM) classifier and other existing methods. Under five-fold cross-validation, RP-FFT achieved average accuracies of 96.28% and 91.87% on the human and yeast datasets, respectively. These results indicate that the RP-FFT prediction model is reasonable and robust.

Highlights

Proteins are an important component of all cells.
The proposed method has four main steps: (1) each protein sequence is described as a Position-Specific Scoring Matrix (PSSM); (2) the fast Fourier transform (FFT) is applied to the PSSM to extract a feature vector; (3) Principal Component Analysis (PCA) reduces the high-dimensional FFT output and removes noise, so that the underlying pattern in the data is easier to find; (4) the random projection (RP) algorithm is used to build the training set on which the classifier is trained.
To assess the stability and effectiveness of our prediction model, we used five measures commonly used in binary classification tasks: accuracy (Acc.), sensitivity (Sen.), specificity (Spe.), Matthews correlation coefficient (MCC) [26,27,28,29,30,31,32], and balanced accuracy (B_Acc.) [33].
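As a rough illustration of the five measures listed above, the sketch below computes them from the four cells of a binary confusion matrix using their standard textbook definitions. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute Acc., Sen., Spe., MCC and B_Acc. from confusion-matrix counts.

    tp/fn count true SIPs predicted as SIP / non-SIP;
    tn/fp count true non-SIPs predicted as non-SIP / SIP.
    """
    total = tp + tn + fp + fn
    acc = (tp + tn) / total if total else 0.0
    sen = tp / (tp + fn) if (tp + fn) else 0.0   # true positive rate
    spe = tn / (tn + fp) if (tn + fp) else 0.0   # true negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when any marginal is zero; 0.0 is a common convention.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    b_acc = (sen + spe) / 2                      # mean of Sen. and Spe.
    return {"Acc": acc, "Sen": sen, "Spe": spe, "MCC": mcc, "B_Acc": b_acc}


# Hypothetical counts, for illustration only.
example = binary_metrics(tp=40, tn=50, fp=5, fn=5)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

Balanced accuracy and MCC are useful alongside plain accuracy when the positive and negative classes differ greatly in size, because accuracy alone can look high for a classifier that mostly predicts the majority class.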

Summary

Introduction

Proteins are an important component of all cells. They are organic macromolecules and the basic material of life. The proposed method has four main steps: (1) each protein sequence is described as a Position-Specific Scoring Matrix (PSSM); (2) the fast Fourier transform (FFT) is applied to the PSSM to extract a feature vector; (3) Principal Component Analysis (PCA) reduces the high-dimensional FFT output and removes noise, so that the underlying pattern in the data is easier to find; (4) the RP algorithm is used to build the training set on which the classifier is trained. In more detail: first, the FFT is applied to the PSSM of each protein sequence to extract important information, yielding a 400-dimensional feature vector; next, PCA reduces the FFT vector to 300 dimensions to improve prediction performance; finally, the RP classifier performs classification on the yeast and human datasets. The results indicate that the proposed model is suitable for predicting SIPs and performs well.
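The feature-extraction steps described above (FFT on the PSSM to obtain a 400-dimensional vector, then PCA down to 300 dimensions) can be sketched as follows. This summary does not say how the 400 values are derived, so the sketch assumes the magnitudes of the first 20 frequency bins of each of the 20 PSSM columns are kept (20 × 20 = 400). That choice, the function names `fft_features` and `pca_reduce`, and the random stand-in PSSMs are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

N_FREQ = 20  # assumed: low-frequency bins kept per PSSM column


def fft_features(pssm: np.ndarray, n_freq: int = N_FREQ) -> np.ndarray:
    """Map an (L, 20) PSSM to a fixed-length vector of FFT magnitudes."""
    pssm = np.asarray(pssm, dtype=float)
    if pssm.ndim != 2 or pssm.shape[1] != 20:
        raise ValueError("expected a PSSM of shape (L, 20)")
    # Zero-pad short sequences so that at least n_freq bins exist.
    n = max(pssm.shape[0], n_freq)
    spectrum = np.fft.fft(pssm, n=n, axis=0)  # FFT along the sequence axis
    # Keep the magnitudes of the first n_freq bins of every column.
    return np.abs(spectrum[:n_freq]).ravel()  # 20 * 20 = 400 values


def pca_reduce(X: np.ndarray, n_components: int = 300) -> np.ndarray:
    """Project the rows of X onto the leading principal components (via SVD)."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                   # centre each feature
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    k = min(n_components, vt.shape[0])        # cannot exceed the rank bound
    return Xc @ vt[:k].T


# Random integer matrices stand in for real PSI-BLAST PSSMs.
rng = np.random.default_rng(0)
pssms = [
    rng.integers(-10, 11, size=(int(rng.integers(30, 200)), 20))
    for _ in range(320)
]
X = np.stack([fft_features(p) for p in pssms])
X_reduced = pca_reduce(X)
print(X.shape, X_reduced.shape)
```

The reduced matrix would then be passed to a classifier. The RP classifier itself is not sketched here because this summary gives no details of its construction.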

Performance Evaluation

Performance of the Proposed Method

Comparison with Other Feature Extraction Methods

Comparison with the SVM-Based Method

Datasets

Position-Specific Scoring Matrix

Fast Fourier Transform

Support Vector Machine

Random Projection Classifier

Conclusions


Journal: International Journal of Molecular Sciences	Publication Date: Feb 21, 2019
Citations: 30	License type: CC BY 4.0


