Major Histocompatibility Complex (MHC) molecules play a critical role in the immune system by presenting peptides on the cell surface for recognition by T-cells. Tumor cells often produce MHC peptides with amino acid mutations, known as neoantigens, which evade T-cell recognition, leading to rapid tumor growth. In immunotherapies such as TCR-T and CAR-T, identifying these mutated MHC peptide sequences is crucial. Current mass spectrometry-based peptide identification methods rely primarily on database searching, which cannot detect mutated peptides that are absent from human protein databases. In this paper, we propose NeoMS, a novel workflow designed to efficiently identify both non-mutated and mutated MHC-I peptides from mass spectrometry data. NeoMS uses a tagging algorithm to generate, for each sample, an expanded sequence database that includes potential mutated proteins. It then applies a machine learning-based scoring function to each peptide-spectrum match (PSM) to maximize search sensitivity. Finally, a rigorous target-decoy approach controls the false discovery rate (FDR) separately for peptides with and without mutations. Experimental results for regular peptides demonstrate that NeoMS outperforms four benchmark methods. For mutated peptides, NeoMS successfully identifies hundreds of high-quality mutated peptides in a melanoma-associated sample, with their validity confirmed by further studies.
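The separate FDR control mentioned above can be illustrated with a minimal sketch of generic target-decoy filtering. This is not the NeoMS implementation: the abstract does not specify the exact procedure, and the `PSM` record, the `accept_at_fdr` and `accept_separately` functions, and the assumption that a higher score means a better match are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PSM:
    """Hypothetical peptide-spectrum match record (not the NeoMS data model)."""
    peptide: str
    score: float      # assumption: higher score means a better match
    is_decoy: bool    # True if the peptide comes from the decoy database
    is_mutated: bool  # True if the peptide carries an amino acid mutation


def accept_at_fdr(psms: List[PSM], max_fdr: float = 0.01) -> List[PSM]:
    """Keep target PSMs in the longest score-ranked prefix whose
    estimated FDR (decoys / targets) does not exceed max_fdr."""
    ranked = sorted(psms, key=lambda p: p.score, reverse=True)
    targets = decoys = 0
    cutoff = 0  # number of top-ranked PSMs to keep
    for i, psm in enumerate(ranked, start=1):
        if psm.is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets > 0 and decoys / targets <= max_fdr:
            cutoff = i
    return [p for p in ranked[:cutoff] if not p.is_decoy]


def accept_separately(
    psms: List[PSM], max_fdr: float = 0.01
) -> Tuple[List[PSM], List[PSM]]:
    """Apply the FDR threshold to regular and mutated PSMs independently,
    so that one group's decoy rate does not affect the other's cutoff."""
    regular = [p for p in psms if not p.is_mutated]
    mutated = [p for p in psms if p.is_mutated]
    return accept_at_fdr(regular, max_fdr), accept_at_fdr(mutated, max_fdr)
```

Splitting the PSMs before thresholding matters because mutated candidates typically come from a much larger search space than regular ones; a pooled threshold would let the numerous regular identifications mask a higher error rate among the mutated peptides.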