Tyrosinase is a key enzyme in the production of melanin, the pigment responsible for the color of hair, eyes, and skin. Tyrosinase inhibitory peptides (TIPs), which are designed to regulate tyrosinase activity, are of interest in cosmetics, dermatology, and pharmaceuticals because of their potential for controlling skin pigmentation. To date, a few machine learning-based models have been proposed for predicting TIPs, but their predictive performance remains unsatisfactory. In this study, we propose a computational approach, named TIPred-MVFF, to accurately predict TIPs using only sequence information. First, we established an up-to-date, high-quality dataset by collecting samples from various sources. Second, we applied a multi-view feature fusion (MVFF) strategy to extract and explore the probability and category information embedded in TIPs, employing several machine learning (ML) algorithms coupled with different commonly used sequence-based feature encodings. Third, we employed resampling approaches to address the class imbalance issue. Finally, to maximize the utility of each feature, we fused the probability-based and sequence-based features, generating more informative features that were used to develop the final prediction model. On the independent test, TIPred-MVFF outperformed several conventional ML classifiers and existing methods in terms of prediction accuracy and robustness, achieving an accuracy of 0.937 and a Matthews correlation coefficient of 0.847. This new computational approach is anticipated to aid community-wide efforts in rapidly and cost-effectively discovering novel peptides with strong tyrosinase inhibitory activity.
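
For readers less familiar with the two reported metrics, the following minimal sketch shows how accuracy and the Matthews correlation coefficient are conventionally computed from binary confusion-matrix counts. It is an illustration only, not the authors' evaluation code: the function names and the example counts are hypothetical and do not correspond to the TIPred-MVFF test set.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary classifier.

    Ranges from -1 (total disagreement) to +1 (perfect prediction).
    By convention, returns 0.0 when any marginal sum is zero and the
    denominator would otherwise vanish.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


if __name__ == "__main__":
    # Hypothetical counts chosen only to exercise the functions;
    # they are NOT taken from the study.
    tp, tn, fp, fn = 40, 50, 5, 5
    print(f"accuracy = {accuracy(tp, tn, fp, fn):.3f}")
    print(f"MCC      = {matthews_corrcoef(tp, tn, fp, fn):.3f}")
```

Unlike accuracy, the Matthews correlation coefficient uses all four confusion-matrix cells, which is why it is commonly reported alongside accuracy when the classes are imbalanced.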