Predicting human leukocyte antigen (HLA) class II binding peptides plays an important role in understanding the mechanism of immune recognition and in developing effective epitope-based vaccines. In this work, a gated recurrent unit (GRU)-based recurrent neural network (RNN) was employed to establish a pan-specific prediction model of HLA-II-binding peptides using only HLA and peptide sequence information. Compared with existing pan-specific models of HLA-II-binding peptides, the GRU-based RNN model covers a broad spectrum of HLA-II molecules, including 50 HLA-DR, 47 HLA-DQ, and 19 HLA-DP molecules, with peptide lengths ranging from 8 to 43 mers. The model showed strong discriminative capability, with AUC values of 0.92, 0.88, and 0.88 for the training, validation, and test sets, respectively. The GRU-based model also showed state-of-the-art performance in predicting binding peptides with lengths of 8 to 32 mers, providing an efficient method, relative to the available methods, for predicting longer HLA-II-binding peptides. Overall, taking advantage of the RNN architecture, the established pan-specific GRU model can be used to predict HLA-II-binding peptides accurately in a simple and direct manner.
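To illustrate why a recurrent architecture can accept peptides of varying length from sequence information alone, the following is a minimal sketch of a single-layer GRU scorer in plain Python. It is not the model described above: the one-hot encoding, the concatenation of HLA and peptide sequences into one input, the hidden size, the class name `TinyGRU`, and the placeholder sequences are all assumptions made for illustration, and the weights are random and untrained, so the scores carry no biological meaning.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot(residue):
    """Encode one residue as a 20-dimensional one-hot vector (assumed encoding)."""
    vec = [0.0] * len(AMINO_ACIDS)
    vec[AA_INDEX[residue]] = 1.0
    return vec


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rand_matrix(rng, rows, cols, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class TinyGRU:
    """Hypothetical single-layer GRU with a sigmoid output; weights are untrained."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        # Input (W) and recurrent (U) weights for the update gate "z",
        # the reset gate "r", and the candidate state "n". Biases are omitted.
        self.W = {g: rand_matrix(rng, hidden_size, input_size) for g in "zrn"}
        self.U = {g: rand_matrix(rng, hidden_size, hidden_size) for g in "zrn"}
        self.out = [rng.uniform(-0.1, 0.1) for _ in range(hidden_size)]

    def step(self, x, h):
        """One GRU update: gates decide how much of the old state h to keep."""
        z = [sigmoid(a + b)
             for a, b in zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b)
             for a, b in zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        reset_h = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a + b)
             for a, b in zip(matvec(self.W["n"], x), matvec(self.U["n"], reset_h))]
        return [zi * hi + (1.0 - zi) * ni for zi, hi, ni in zip(z, h, n)]

    def score(self, hla_seq, peptide):
        """Read HLA then peptide residues in order; return a value in (0, 1)."""
        h = [0.0] * self.hidden_size
        for residue in hla_seq + peptide:  # concatenation is an assumption
            h = self.step(one_hot(residue), h)
        return sigmoid(sum(w * hi for w, hi in zip(self.out, h)))


# Placeholder sequences, not real HLA or peptide data.
model = TinyGRU(input_size=len(AMINO_ACIDS), hidden_size=8)
hla_placeholder = "ACDEFGHIKLMNPQRSTVWY"
for peptide in ("ACDEFGHI", "ACDEFGHIKLMNPQRSTVWY" * 2 + "ACD"):
    print(len(peptide), model.score(hla_placeholder, peptide))
```

Because the hidden state has a fixed size regardless of how many residues are read, the same weights handle an 8-mer and a 43-mer without padding or truncation. A trained model would fit these weights to binding data and would typically be evaluated with a metric such as AUC.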