Abstract
BackgroundProtein structural class predicting is a heavily researched subject in bioinformatics that plays a vital role in protein functional analysis, protein folding recognition, rational drug design and other related fields. However, when traditional feature expression methods are adopted, the features usually contain considerable redundant information, which leads to a very low recognition rate of protein structural classes.ResultsWe constructed a prediction model based on wavelet denoising using different feature expression methods. A new fusion idea, first fuse and then denoise, is proposed in this article. Two types of pseudo amino acid compositions are utilized to distill feature vectors. Then, a two-dimensional (2-D) wavelet denoising algorithm is used to remove the redundant information from two extracted feature vectors. The two feature vectors based on parallel 2-D wavelet denoising are fused, which is known as PWD-FU-PseAAC. The related source codes are available at https://github.com/Xiaoheng-Wang12/Wang-xiaoheng/tree/master.ConclusionsExperimental verification of three low-similarity datasets suggests that the proposed model achieves notably good results as regarding the prediction of protein structural classes.
Highlights
Protein structural class predicting is a heavily researched subject in bioinformatics that plays a vital role in protein functional analysis, protein folding recognition, rational drug design and other related fields
Nanni et al proposed a new feature fusion method based on the features of the primary sequence and the features of the secondary structure based on prediction [29]
Type 1 pseudo amino acid composition was proposed by Chou in 2001 [43]
Summary
Protein structural class predicting is a heavily researched subject in bioinformatics that plays a vital role in protein functional analysis, protein folding recognition, rational drug design and other related fields. When traditional feature expression methods are adopted, the features usually contain considerable redundant information, which leads to a very low recognition rate of protein structural classes. To solve the problems of redundant information and low recognition rates, many computational methods have been proposed to predict protein structural classes during the past 30 years. One such method is the feature extraction method based on the information in amino acid sequences. Compared with the previous two methods, this method considered the sequence factor between amino acid residues These methods have achieved good prediction results on high similarity datasets but poor results on low similarity datasets. Liu et al used a recursive feature selection algorithm to select the optimal feature vector [32]
Published Version (Free)
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have