Abstract

Apoptosis is a fundamental process that maintains normal tissue homeostasis by balancing cell proliferation and cell death. Predicting the subcellular location of apoptosis proteins helps in understanding the mechanism of programmed cell death, and predicting protein subcellular localization with bioinformatic techniques opens opportunities in related fields. In this work, we propose using a hierarchical extreme learning machine (H-ELM) to classify high-dimensional input data without requiring a dimension reduction step, and it yields acceptable results. Features are extracted from different perspectives and then fused. Based on the position-specific scoring matrix (PSSM), the first feature type describes the correlation within the sequence, using the autocorrelation function over relatively random sections of the sequence; the second type is the Kullback-Leibler (K-L) divergence between the two distributions formed by the amino acids' constituent proportions. An experiment shows that mixing features from different sources by simple concatenation yields a poor result, whereas the composite feature fused with t-distributed stochastic neighbor embedding (t-SNE) greatly improves the classification. Finally, after tuning the hyper-parameters of H-ELM, the highest overall accuracy is 87.5% on the ZD98 dataset and 92.4% on the CL317 dataset.
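The two feature types named above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify which two distributions are compared, how sections are sampled, or how the PSSM is normalized. The helper names (`composition`, `kl_divergence`, `autocorrelation`), the pseudocount smoothing, and the use of a single score column are all assumptions made for illustration.

```python
import math
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq, pseudocount=1.0):
    """Amino acid proportions of a sequence (hypothetical helper).

    A pseudocount is added so that no proportion is zero, which keeps
    the K-L divergence finite. This smoothing is an assumption.
    """
    counts = Counter(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    return [(counts[a] + pseudocount) / total for a in AMINO_ACIDS]


def kl_divergence(p, q):
    """K-L divergence D(p || q) = sum_i p_i * ln(p_i / q_i)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def autocorrelation(values, lag):
    """Mean product of values separated by `lag` positions.

    `values` stands in for one column of PSSM scores along a sequence
    section (an assumption about how the feature is built).
    """
    n = len(values) - lag
    if n <= 0:
        raise ValueError("lag must be smaller than the number of values")
    return sum(values[i] * values[i + lag] for i in range(n)) / n


if __name__ == "__main__":
    # Toy sequences, not real apoptosis proteins.
    p = composition("MKVLAAGIVALLLAAGCSSA")
    q = composition("MDEEKRRSTDDEEKKRRSTE")
    print("K-L divergence:", kl_divergence(p, q))
    print("lag-1 autocorrelation:", autocorrelation([1.0, 2.0, 3.0, 4.0], 1))
```

The K-L divergence is asymmetric, so `kl_divergence(p, q)` and `kl_divergence(q, p)` generally differ; which direction (or a symmetrized form) is used would depend on the method details in the full text.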
