Abstract

This paper presents a wavelet-based time-series approach to the protein classification problem. We propose a novel feature vector based on the variation of seven physicochemical properties of amino acids (hydrophobicity, electronic properties, isoelectric point, polarity, volume, composition, and molecular weight). The feature vector contains the wavelet variance information of the physicochemical properties of protein sequences. Its dimension is only 35, compared with the 400-dimensional feature vector of the G protein-coupled receptor (GPCR) prediction technique GPCRpred and the 512-dimensional feature vector of fast Fourier transform (FFT)-based approaches. This low dimension will facilitate the development of computationally efficient and memory-efficient classifiers for drug discovery applications. Experiments were performed on the complete data set available at the GPCR database (GPCRDB). Tests were also conducted on unseen, independent data sets to measure the generalization capability of the proposed classification technique. A performance comparison with GPCRpred and FFT-based approaches shows that the proposed approach performs as well as these existing programs. The proposed approach can also be applied to other problems, including the prediction of protein structural classes, the identification of membrane protein types, and enzyme family classification.
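
As a rough illustration of how a wavelet-variance feature vector of this kind could be assembled, the sketch below maps a sequence to a numeric signal per property, applies a Haar wavelet decomposition, and records the mean squared detail coefficient at each level. Several things here are assumptions and not taken from the paper: the Haar wavelet, the choice of five levels (picked only because 7 properties × 5 levels gives 35 features), the use of the Kyte-Doolittle hydropathy index as a stand-in for the hydrophobicity scale, the function names, and the example sequence, which is an arbitrary string of residues.

```python
import math

# Kyte-Doolittle hydropathy index. Assumption: used only as a stand-in;
# the paper's actual property scales are not given in the abstract.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def haar_step(signal):
    """One level of the Haar DWT; a trailing unpaired sample is dropped."""
    pairs = len(signal) // 2
    root2 = math.sqrt(2)
    approx = [(signal[2 * i] + signal[2 * i + 1]) / root2 for i in range(pairs)]
    detail = [(signal[2 * i] - signal[2 * i + 1]) / root2 for i in range(pairs)]
    return approx, detail


def wavelet_variances(signal, levels=5):
    """Mean squared Haar detail coefficient at each decomposition level."""
    variances = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        if not detail:
            raise ValueError("sequence too short for the requested levels")
        variances.append(sum(d * d for d in detail) / len(detail))
    return variances


def feature_vector(sequence, scales, levels=5):
    """Concatenate per-level wavelet variances over all property scales."""
    features = []
    for scale in scales:
        # Raises KeyError on residues outside the 20 standard amino acids.
        signal = [scale[residue] for residue in sequence]
        features.extend(wavelet_variances(signal, levels))
    return features


if __name__ == "__main__":
    # Arbitrary illustrative sequence, not a real GPCR.
    seq = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRVGDGTQDNLSGAEKAVQ"
    # One scale yields 5 features; seven distinct scales would yield 35.
    print(len(feature_vector(seq, [KYTE_DOOLITTLE])))
```

With seven property scales passed in `scales`, the resulting vector has 35 entries regardless of sequence length, which is what makes a fixed-size input to a classifier possible for proteins of varying length.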
