Abstract

Accurately recognizing nitrated tyrosine residues in protein sequences would pave the way for understanding the mechanism of nitration and for screening tyrosine residues in sequences. In this study, we propose a prediction model that uses the extreme learning machine (ELM) algorithm as the prediction engine to identify nitrated tyrosine residues. To encode each tyrosine residue, a sliding window technique was adopted to extract a peptide segment for it, from which a number of features were extracted. These features were analyzed with a popular feature selection method, Minimum Redundancy Maximum Relevance (mRMR), producing a list in which all features were ranked. The Incremental Feature Selection (IFS) method was then used to identify the optimal features, on which the optimal ELM-based prediction model was built. This model produced satisfactory results on the training dataset, with a Matthews correlation coefficient of 0.757. The model was also evaluated on an independent test dataset that contained only positive samples, yielding a sensitivity of 0.938. Compared with prediction models that use classic machine learning algorithms as prediction engines on the same datasets with their own optimal features, the optimal ELM-based model produced much better results, indicating the superiority of the proposed model for identifying nitrated tyrosine residues from protein sequences.
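The sliding-window encoding and the two reported metrics can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the window size or say how sequence termini are handled, so the `half_width` parameter, the `X` padding character, and all function names here are assumptions made for illustration.

```python
from math import sqrt


def extract_windows(sequence, half_width=5, pad="X"):
    """Return (position, segment) pairs, one per tyrosine (Y) in the sequence.

    half_width and pad are illustrative assumptions, not values from the study.
    """
    # Pad both ends so tyrosines near a terminus still get a full-length segment.
    padded = pad * half_width + sequence + pad * half_width
    windows = []
    for i, residue in enumerate(sequence):
        if residue == "Y":
            # Residue i sits at index i + half_width in the padded string,
            # so the segment centered on it starts at index i.
            windows.append((i, padded[i:i + 2 * half_width + 1]))
    return windows


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def sensitivity(tp, fn):
    """Fraction of true positives recovered: TP / (TP + FN)."""
    return tp / (tp + fn)


if __name__ == "__main__":
    for position, segment in extract_windows("MKYAAY", half_width=2):
        print(position, segment)
```

Sensitivity is the only one of these metrics that is defined on a test set with positive samples alone, since without negatives the true-negative and false-positive counts are both zero and the Matthews correlation coefficient degenerates.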

