Some specific kinds of proteins are responsible for the risk of immediate type I allergic reaction. Therefore, the proteins that are made to use in the consumer product should be checked for their allergic reactions before introducing them in the market. The FAO/WHO instructions for the assessment of allergic proteins depend on the linear sequence window identity and short peptide hits misclassify many proteins as allergen proteins. This study introduces the AllerPredictor model that predicts the allergen & non-allergen proteins depending on the sequence of proteins. Data was downloaded from two major databases, FARRP and UniProtKB. The results of this model were validated with the help of self-consistency testing, independence testing, and jackknife testing. The accuracy for self-consistency validation is 99.89%, for the independence testing is 74.23%, and for 10-fold cross-validation, it is 97.17%. To predict the allergen and non-allergen proteins, this AllerPredictor model has a better accuracy than other existing methods.
Read full abstract