T-cell receptors (TCRs) play a critical role in the immune response by recognizing specific peptide ligands presented by major histocompatibility complex (MHC) molecules. Accurate prediction of peptide binding to TCRs is essential for advancing immunotherapy and vaccine design and for understanding the mechanisms of autoimmune disorders. This study presents a novel theoretical method that explores how feature selection techniques affect the predictive accuracy of peptide binding models tailored to specific TCRs. To evaluate the generality of our approach across different TCR systems, we used a dataset of peptide libraries tested against three distinct murine TCRs. A broad range of physicochemical properties, including amino acid composition, dipeptide composition, and tripeptide features, was integrated into a machine learning-based feature selection framework to identify the key features contributing to binding affinity. Our analysis reveals that optimized feature subsets not only reduce model complexity but also improve predictive performance, enabling more precise identification of TCR-peptide interactions. The results of our feature selection method are consistent with experimental data and with findings from hybrid approaches that use both sequence and structural data as input. Our theoretical approach highlights the role of feature selection in modeling peptide-TCR interactions, providing a powerful tool for uncovering the molecular mechanisms of the T-cell response and assisting in the design of more advanced targeted therapeutics.