T-cell receptors (TCRs) mediate immune responses by recognizing peptides in complex with major histocompatibility complexes (pMHC) displayed on the surface of cells. Resolving the challenge of predicting the cognate pMHC target of a TCR would benefit many applications in the field of immunology, including vaccine design and discovery and the development of immunotherapies. Here, we developed a model for prediction of TCR targets based on similarity to a database of TCRs with known targets. Benchmarking the model on a large set of TCRs with known targets, we demonstrated that predictive performance is increased (i) by focusing on CDRs rather than the full-length TCR protein sequences, (ii) by incorporating information from paired α and β chains, and (iii) by integrating information from all 6 CDR loops rather than just CDR3. Finally, we show that integrating the structure of CDR loops, as obtained through homology modeling, boosts the predictive power of the model, in particular in situations where no high-similarity TCRs are available for the query. These findings demonstrate that TCRs that bind to the same target also share, to a very high degree, sequence and structural features. This observation has a profound impact on the future development of prediction models for TCR-pMHC interactions and on the use of such models for the rational design of T-cell-based therapies.