Abstract

It has recently become possible to simultaneously assay T‐cell specificity with respect to large sets of antigens and the T‐cell receptor sequence in high‐throughput single‐cell experiments. Leveraging this new type of data, we propose and benchmark a collection of deep learning architectures to model T‐cell specificity in single cells. In agreement with previous results, we found that models that treat antigens as categorical outcome variables outperform those that model the TCR and antigen sequence jointly. Moreover, we show that variability in single‐cell immune repertoire screens can be mitigated by modeling cell‐specific covariates. Lastly, we demonstrate that the number of bound pMHC complexes can be predicted in a continuous fashion, providing a gateway to disentangle cell‐to‐dextramer binding strength and receptor‐to‐pMHC affinity. We provide these models in the Python package TcellMatch to allow imputation of antigen specificities in single‐cell RNA‐seq studies on T cells without the need for MHC staining.

Highlights

  • Antigen recognition is one of the key factors of T-cell-mediated immunity

  • A joint deep learning model for alpha- and beta-chain, antigens, and covariates

  • Before the introduction of single-cell T-cell receptor (TCR) reconstruction with coupled antigen binding detection via peptides immobilized on MHC multimers (pMHC) (Fig. 1a), most paired observations of TCR and bound antigen included only the TCR β-chain; such observations are often found in entries of databases such as IEDB[6] or VDJdb[7]

  • We explore a data set based on single-cell pMHC capture in which paired ɑ- and β-chains could be successfully reconstructed for tens of thousands of cells and binding specificity was measured for 44 distinct pMHC complexes[8]

Summary

Introduction

The ability to accurately predict T-cell activation upon epitope recognition would have transformative effects on many research areas, from infectious disease and autoimmunity to vaccine design and cancer immunology, but has been thwarted by a lack of training data and adequate models. Because of insufficient data, previous models for T-cell specificity were based only on the CDR3β loop[3,4,5]. We exploit a newly developed single-cell technology that enables simultaneous sequencing of the paired TCR ɑ- and β-chains while determining T-cell specificity, and use it to train multiple deep learning architectures that model the TCR-pMHC interaction using both chains. The models include single-cell-specific covariates that account for the variability found in such data, thereby fully exploiting the multiplicity of observations that can be sampled in single-cell screens. To facilitate the use of our predictive algorithms, we built the Python package TcellMatch, which hosts a pre-trained model zoo for analysts to impute pMHC-derived antigen specificities and allows transfer and re-training of models on new data sets.
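
To make the categorical formulation concrete, the following is a minimal sketch of how paired-chain input and cell-level covariates could feed a classifier over a fixed antigen panel. It is not the TcellMatch API or any of the benchmarked architectures: the function names, the one-hot encoding, the padding length of 20, the two example covariates, and the single linear softmax layer with random weights are all assumptions made for illustration.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_cdr3(seq, max_len=20):
    """One-hot encode a CDR3 sequence, zero-padded (or truncated) to max_len."""
    mat = np.zeros((max_len, len(AMINO_ACIDS)))
    for pos, aa in enumerate(seq[:max_len]):
        mat[pos, AA_INDEX[aa]] = 1.0
    return mat.ravel()


def featurize(cdr3_alpha, cdr3_beta, covariates, max_len=20):
    """Concatenate both encoded chains with cell-level covariates (hypothetical)."""
    return np.concatenate([
        one_hot_cdr3(cdr3_alpha, max_len),
        one_hot_cdr3(cdr3_beta, max_len),
        np.asarray(covariates, dtype=float),
    ])


def predict_antigen_probs(features, weights, bias):
    """Softmax over antigen categories: antigens are outcome classes, not inputs."""
    logits = features @ weights + bias
    logits = logits - logits.max()  # subtract max for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()


# Usage with untrained random weights, purely to show shapes.
rng = np.random.default_rng(0)
n_antigens = 44  # size of the pMHC panel mentioned in the highlights
x = featurize("CAVRDSNYQLIW", "CASSIRSSYEQYF", covariates=[1.0, 0.3])
w = rng.normal(scale=0.01, size=(x.size, n_antigens))
b = np.zeros(n_antigens)
probs = predict_antigen_probs(x, w, b)
```

In this layout the antigen identity appears only as the index of the output vector, which is what distinguishes the categorical formulation from models that also take the antigen sequence as input.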

