Abstract
Cancer immunotherapy enhances the body's natural immune system to combat cancer, offering the advantage of lowered side effects compared to traditional treatments because of its high selectivity and efficacy. Utilizing computational methods to identify tumor T cell antigens (TTCAs) is valuable in unraveling the biological mechanisms and enhancing the effectiveness of immunotherapy. In this study, we present ENCAP, a predictor for TTCA based on ensemble classifiers and diverse sequence features. Sequences were encoded as a feature vector of 4349 entries based on 57 different feature types, followed by feature engineering and hyperparameter optimization for machine learning models, respectively. The selected feature subsets of ENCAP are primarily composed of physicochemical properties, with several features specifically related to hydrophobicity and amphiphilicity. Two publicly available datasets were used for performance evaluation. ENCAP yields an AUC (Area Under the ROC Curve) of 0.768 and an MCC (Matthew's Correlation Coefficient) of 0.522 on the first independent test set. On the second test set, it achieves an AUC of 0.960 and an MCC of 0.789. Performance evaluations show that ENCAP generates 4.8% and 13.5% improvements in MCC over the state-of-the-art methods on two popular TTCA datasets, respectively. For the third test dataset of 71 experimentally validated TTCAs from the literature, ENCAP yields prediction accuracy of 0.873, achieving improvements ranging from 12% to 25.7% compared to three state-of-the-art methods. In general, the prediction accuracy is higher for sequences of fewer hydrophobic residues, and more hydrophilic and charged residues. The source code of ENCAP is freely available at https://github.com/YnnJ456/ENCAP.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.