Abstract

Novel single-cell-based technologies hold the promise of matching T cell receptor (TCR) sequences with their cognate peptide-MHC recognition motif in a high-throughput manner. Parallel capture of TCR transcripts and peptide-MHC is enabled through the use of reagents labeled with DNA barcodes. However, analysis and annotation of such single-cell sequencing (SCseq) data are challenged by dropout, random noise, and other technical artifacts that must be carefully handled in the downstream processing steps. Here we propose a rational, data-driven method termed ITRAP (improved T cell Receptor Antigen Paring) to deal with these challenges by filtering away likely artifacts, enabling the generation of large sets of TCR-pMHC sequence data with a high degree of specificity and sensitivity and outputting the most likely pMHC target per T cell. We have validated this approach across 10 different virus-specific T cell responses in 16 healthy donors. Across these samples, we have identified up to 1494 high-confidence TCR-pMHC pairs derived from 4135 single cells.

Editor's evaluation

This paper is of interest to immunologists conducting single-cell analyses of T-cell recognition. It provides improved means of curating datasets to reduce noise and identify T cell-antigen pairs with greater confidence. Experimental data from human virus-specific TCRs are used to validate the methodology. https://doi.org/10.7554/eLife.81810.sa0

Introduction

T cells are essential for immune protection and play a critical role in the immune response to pathogens or cancer, where they directly kill infected or malignant host cells or orchestrate the response of other immune cells.
Recognition is mediated through the heterodimeric T-cell receptor (TCR) expressed on the surface of T cells, which engages specifically with a peptide antigen (p) displayed in the MHC. Accurate specificity and broad coverage of antigen recognition are obtained through somatic recombination of the genetic loci, V(D)J, that encode the α (VJ) and β (VDJ) chains of the TCR. The process creates an extensively variable and dynamic repertoire, with an estimated 10⁷ distinct αβTCRs in an individual (Arstila et al., 1999; Davis and Bjorkman, 1988). Due to this diversity, the individual TCR transcripts can be used as endogenous cellular barcodes inherited by the T cell progeny. This has been utilized to provide quantitative insight into TCR diversity (Robins et al., 2009), to trace lineage decisions of T cells (Gerlach et al., 2013), and to monitor the dynamics of T cells across immune-related diseases, such as infectious disease (Dziubianau et al., 2013; Hou et al., 2016), cancer (Kirsch et al., 2015; Sherwood, 2013; Zhang et al., 2018), and autoimmunity (Acha-Orbea et al., 1988; Madi et al., 2014). Most such TCR repertoire studies have been confined to bulk experiments, tracing the TCR β chain because of its greater diversity (compared to the α chain) and because it is less ambiguous due to allelic exclusion (Bergman, 1999). However, accurate pairing of the variable TCR α and β regions is valuable for uncovering the biological function of a T cell and is generally lost in bulk experiments since the transcripts are separately encoded. Moreover, we and others have earlier demonstrated that paired α and β TCR data are essential for the characterization and learning of the relationship between the TCR sequence and specificity (Montemurro et al., 2021). To accurately obtain paired TCR αβ sequences, single-cell sequencing platforms can be applied to simultaneously capture both TCR chains, while retaining cell origin information.
To further assign specificity information to such TCRs, T cells can be stained with barcode-labeled pMHC multimers to simultaneously identify pMHC specificity and TCR sequence of individual cells (Bentzen et al., 2016; Zhang et al., 2018). Moreover, via DNA barcoded antibodies, the platform facilitates screening of surface proteins to distinguish cellular subtypes and enables cell hashing to trace the origin of a given cell to, for example, a given donor, sample, or time point, which is highly valuable in patient studies. Here we thus applied single-cell sequencing to describe the T cell specificities toward a set of viral-derived peptide-MHC (pMHC) complexes. The pMHCs were selected with the purpose of generating data to expand the current knowledge of TCR-pMHC interactions, and hence covered pMHCs with limited or no paired TCR coverage in public-domain databases such as IEDB (Vita et al., 2019) and VDJdb (Bagaev et al., 2020). We deployed the droplet-based single-cell platform from 10x Genomics. Ideally a droplet contains a single cell with all its analytes and a gel-bead in emulsion (GEM). The gel-bead contains barcoded primers that ensure tracing of transcripts back to the cell of origin; such droplets are referred to as GEMs. While the platform is highly promising, the sequence deconvolution is associated with substantial noise, and it is challenging to discriminate true from false signals. Common confounding factors include stochastic gene expression, cell cycle variations, apoptosis, and technical artifacts such as multiplet capture, contamination, dropout, and batch effects. Dropout and stochastic gene expression both result in zero-inflated gene counts and are typically insensitive to low expression levels (Buettner et al., 2015; Kharchenko et al., 2014; Yamawaki et al., 2021). Multiplet capture is the event of capturing two or more cells in a single GEM, and it is proportional to the capture rate of cells introduced to the system (Bloom, 2018; Zheng et al., 2017).
The capture rate is determined by the rate of pulsing cells relative to the rate of gel-beads. Thus, to include even low-frequency cell populations, the capture rate must be high at the expense of introducing more multiplets. Contamination is particularly an issue when including analytes such as pMHC multimers, which may be dissolved in cell suspension (Gaublomme et al., 2019). The platform has no means of controlling how ambient analytes and their barcodes are partitioned with GEMs, which leads to GEMs that appear like multiplets or consist of ambiguous annotations from multiple analyte barcodes. The reverse issue arises from the risk that analytes may dissociate from the cell before capture. The listed confounders may result in both false-positive and false-negative discoveries. The main concerns when screening for TCR specificity are nonspecific binding of pMHC and/or cell hashing analytes, incomplete TCR annotation, and T cell multiplets. Nonspecific binding and T cell multiplets may completely dilute the signal from actual interactions, while incomplete TCRs that are missing the annotation for either α- or β-chain render the single-cell setup superfluous. To ensure that a screening is fully exploited and interpreted correctly, we set out to develop a data-driven algorithm that facilitates a consistent and reproducible TCR categorization (clonotyping), peptide-MHC (pMHC) annotation, and antibody-based cell hashing referencing of the donors and their HLA profile. We applied this algorithm to a dataset derived from screening PBMCs from 16 healthy donors for T cell recognition against common viruses. In total, we evaluated TCR recognition against 10 different pMHC multimers, each labeled with their unique barcode. We demonstrate that following the filtering steps described here we can obtain a confident pairing of pMHC specificity and TCR sequence. 
This strategy will open novel opportunities to evaluate the structural interplay and the sequence-driven signatures of pMHC recognition at large scale.

Results

Parallel capture of TCRαβ sequences, peptide-MHC specificity, and sample origin from single cells

To obtain single-cell-derived triad information on TCR sequence, pMHC specificity, and sample origin, we stained peripheral blood mononuclear cells (PBMCs) from a total of 16 different healthy donors (Supplementary file 1). All samples were stained with the same panel composed of 10 different viral-derived pMHC multimers, each labeled with a unique barcode for that specificity and a common fluorescent label (allophycocyanin [APC]) (Figure 1; Supplementary file 2). We wished to enrich only for TCRs responsive to less commonly reported pMHCs, hence we co-stained the cells with the three most commonly reported viral-derived pMHC multimers (all A0201 restricted: GLC, GIL, NLV) bearing a different fluorochrome (phycoerythrin [PE]) and labeled with their own unique DNA barcode (Supplementary file 2). We sorted only the APC-labeled pMHC multimer binding T cells (and hence deselected the PE-labeled T cells) and included these in the downstream single-cell processing.

Figure 1. Schematic of experimental design. (A) Schematic of the experimental strategy. All samples are incubated with the same library of barcode-labeled pMHC multimers and subsequently with a sample-specific barcode-labeled hashing antibody to individually label cells derived from a given sample. Multimer-binding cells from all samples are sorted in bulk and processed through the 10x Chromium workflow. The sequencing output simultaneously captures the sample barcode, the pMHC barcode, and the TCR sequences, which are all matched to a single cell based on the 10x barcode. This also provides the means of retrospectively assigning each cell to their sample of origin via the sample-specific hashing barcode.
(B) Example showing how the allophycocyanin (APC)-labeled pMHC multimers are sorted collectively from all samples into one tube that is further carried into the 10x workflow. The phycoerythrin (PE)-labeled pMHC multimers are not sorted and hence deselected. A total of 1800 APC-labeled cells are sorted from each donor. Here showing BC126 (large dotplot) and BC341 (small dotplot). Prior to sorting, each sample was stained with a distinct hashing antibody to provide a sample identification barcode associated with the GEMs of the resultant single-cell data set. This is done to enable mixing of cells from different samples, while retaining the information of sample origin, and utilizing the capacity of capturing 6000–10,000 cells per lane in the 10x Genomics workflow. This is essential when capturing T cells based on their specificity since the MHC multimer-positive population is generally of low frequency (<1% of CD8 T cells). When several samples are mixed in the process of running the single-cell analysis, all mRNA and DNA barcodes (derived from hashing antibodies or the MHC multimers) associated with a given cell will be encoded with the same 10x barcode, proving the GEM association (Figure 1; Supplementary file 1). Total data from simultaneous capture of cell, TCR, pMHC, and sample ID The single-cell data is annotated using 10x Chromium Cellranger multi v6.1. This results in each GEM being quantified by a count of unique molecular identifiers (UMIs) (Kivioja et al., 2011) for the three components (TCR, pMHC, and sample hashing) based on transcripts of TCR α- and β-chains, barcodes co-attached to pMHC multimers and barcodes co-attached to cell hashing antibodies (Supplementary file 2). 
To obtain the data presented here, a total of 1800 pMHC multimer-positive cells were sorted per donor irrespective of the frequency or the number of different antigen-specific T cell responses in a given sample, accumulating to a total of 28,800 cells sorted (Figure 1—figure supplement 1). All sorted cells were loaded into a single lane for 10x processing. Based on experience with pre-sorting of low frequent cell populations, this equals a total of 6000–9000 captured cells per lane after running the full 10x Genomics 5′ pipeline, and an acceptable doublet rate. This indicates that an appropriate proportion of cells are loaded on the Chromium. Initially, each GEM was annotated based on the most abundant transcripts from TCRαβ, pMHC, and cell hashing. However, this can lead to erroneous annotations as the noise level can differ substantially for the different reagents, resulting in different levels of UMIs. Based on raw, unfiltered data, we found 6073 GEMs that contained all three components, that is, TCR, pMHC, and sample hashing, corresponding to 40% of the loaded cells (Figure 2a). A total of 716,069 GEMs only contained one or two of the components, with the majority containing only the cell hashing barcode (n = 677,502) and the second largest share containing cell hashing as well as pMHC barcodes (n = 37,277). This number vastly exceeds the number of cells in the assay (15,700 cells loaded) and indicates contamination from ambient barcodes in suspension. This is further supported by the observation that the sample hashing UMI count was significantly higher (p<0.0005, Mann–Whitney U) in the 6073 GEMs containing a TCR compared to the GEMs void of TCR (Figure 2b). A total of 43,455 GEMs captured a DNA barcode associated with the pMHC library and only 14% of these were completed with TCR transcripts and sample hashing barcodes. 
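The tally of GEMs by captured components described here can be illustrated with a minimal sketch. It assumes each GEM is represented as a dictionary of UMI counts keyed by the hypothetical field names `tcr`, `pmhc`, and `hashing`; these names and the toy records are illustrative only and are not taken from the Cellranger output format.

```python
from collections import Counter

# Hypothetical component names; not the actual Cellranger field names.
COMPONENTS = ("tcr", "pmhc", "hashing")

def component_signature(gem):
    """Return the tuple of components detected (UMI count > 0) in one GEM."""
    return tuple(c for c in COMPONENTS if gem.get(c, 0) > 0)

def tally_components(gems):
    """Count GEMs per combination of detected components (the Venn fields)."""
    return Counter(component_signature(g) for g in gems)

# Toy data: one complete GEM, one hashing-only GEM, one pMHC + hashing GEM.
toy_gems = [
    {"tcr": 5, "pmhc": 12, "hashing": 40},
    {"hashing": 3},
    {"pmhc": 1, "hashing": 2},
]
venn_counts = tally_components(toy_gems)
```

GEMs whose signature is `("tcr", "pmhc", "hashing")` correspond to the complete captures; all other signatures correspond to incomplete captures.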
In the GEMs containing a TCR, 84% were completed with all three components, that is, included hashing and pMHC barcodes, while less than 0.05% of these GEMs were void of both sample hashing and pMHC barcodes. In the following, we will only consider the 6073 GEMs containing all three components, while taking into account that the high degree of noise also affects these seemingly completely mapped GEMs.

Figure 2. Summary of raw data. (a) Venn diagram of the content of all gel-beads in emulsion (GEMs) from 10x Chromium drop-seq. Each GEM is expected to contain three components: transcripts of TCR and DNA barcodes from the target pMHC multimer as well as the sample hashing antibody. The Venn diagram illustrates the extent of GEMs with complete capture (capture of all three components) in contrast to the GEMs with incomplete capture (capture of a subset of components). (b) Comparison of distributions of unique molecular identifier (UMI) counts of sample hashing barcode between GEMs that contain TCR transcripts (TCR-occupied GEMs) and GEMs that do not contain TCR transcripts (TCR-void GEMs) (p<0.0005, Mann–Whitney U). (c) Matrix of the distribution of pMHC singlets and multiplets across GEMs with TCRs either missing a chain, detected with multiple chains, or with a single, unique αβ-pair. The counts are given for each field and illustrated by a color. The lighter color represents higher counts. (d) Scatterplot of all detected pMHC barcodes (y-axis) within each of the 6073 GEMs (x-axis) that contain all three components: TCR, pMHC, and sample hashing. In each GEM, the most abundant pMHC is marked with green, while the remaining pMHCs in the GEM are gray. The marker size reports the UMI count of the given pMHC. The marker shape and color indicate whether the HLA allele of the pMHC matches the HLA haplotype of the donor, which is deduced from sample hashing (yellow x: non-matching HLA).
The fraction of HLA matches within the GEMs displaying a given specificity is annotated to the right of the plot. The first colorbar indicates the type of TCR chain annotation: whether the TCR has a unique αβ-pair, is missing a chain, or consists of multiple chains. The second colorbar is a specificity check against the specificity databases IEDB and VDJdb. Colors highlight the GEMs where the CDR3αβ sequences are contained in the databases. The green color represents a match between the database pMHC and the detected pMHC, while red indicates a mismatch.

The GEMs are distributed across three categories of TCR and two categories of pMHC observations: GEMs either missing a TCR chain, containing multiple TCR chains, or containing a unique TCRαβ-pair, and GEMs containing either a single or multiple pMHC barcodes (Figure 2c). Sample hashing multiplets constitute 100% of GEMs containing sample hashing barcodes, and there are large proportions of both pMHC multiplets (65%) and GEMs missing either the α- or β-chain of the TCR (39%); hence, multiplets of pMHC and sample hashing are the predominant issue. Few GEMs were detected with multiple TCR α- or β-chains (6%). This may be caused partly by naturally occurring multiplets of α-chain (4%) due to the incomplete gene restriction of the thymocyte during negative selection (Elliott and Altmann, 1995; Petrie et al., 1993) or due to experimental features of the 10x platform causing an expected 6.9% of multiplets based on the number of cells loaded in our experiment. Without further filtering, the pMHC-TCR pairing is subjected to extensive noise (Figure 2d), and we capture all the 10 DNA barcodes associated with the APC-labeled pMHCs in a varying number of GEMs.
Importantly, the three negative control responses (GIL A0201, GLC A0201, and NLV A0201), which were present in the donors but not sorted, are only captured in a few GEMs, both as the most abundant pMHC (GIL: 4 GEMs/clonotypes; GLC: 17 GEMs/clonotypes; and NLV: 12 GEMs/clonotypes) and as presumed contamination (i.e., examples where the UMI count of the negative control(s) was not the most abundant). In this latter case, the vast majority (84%) of the negative control pMHCs had UMI counts of 1. Four of the abundant negative control responses matched known IEDB/VDJdb responses. This indicates that the cell isolation via sorting is effective in terms of capturing only the desired cells and relevant pMHC-associated barcode labels. The most frequently detected pMHC across all GEMs is RVR A0301, which is present with high UMI counts across all GEMs. Only RPH(10-mer) B0702-associated UMIs were consistently detected at low numbers per GEM. It was also evaluated whether the HLA allele of the pMHC matches the HLA haplotype of the donor(s) given via cell hashing (Figure 2d). Typically, the mismatches are found in GEMs where the most abundant pMHC is detected at low UMI counts, while the matches consist of GEMs with higher pMHC UMI counts. Of the 65% of GEMs containing pMHC multiplets (Figure 2c), 13% contained two or more pMHCs at the exact same UMI level (Supplementary file 3), which may either represent noise or true cross-binding events. The detected specificities in our data have been cross-referenced with the IEDB (Vita et al., 2019) and VDJdb (Bagaev et al., 2020) databases (Figure 2d). Based on the unfiltered data, we found five TCR-pMHC matches (across nine GEMs) and one TCR (one GEM), which was annotated with a different pMHC (Figure 2d).
This latter is a case of a GEM with multiple pMHCs present with almost equal number of UMIs, where the most abundant pMHC is RVR A0301 (11 UMIs) and the second most abundant pMHC is GLC A0201 (9 UMIs), which is the peptide registered as target in IEDB and VDJdb. The data in Figure 2d suggests that most of the captured T cells interact with several of the screened pMHCs to a degree that exceeds the level expected from natural cross-recognition. Therefore, it is reasonable to assume that a large proportion of these multiplets are formed as a result of ambient pMHC leaking into GEMs. A data-driven filtering approach From these observations, it is clear that a substantial part of the data consists of noise, that is, GEMs with multiplets of pMHC and sample hashing, and that the data must be filtered for proper interpretation. Clonotype annotation The definition of T-cell clones (clonotypes) is fundamental for pairing a given TCR clonotype to its respective pMHC recognition. Initial clonotypes were called using 10x Genomics Cellranger, which defines a clonotype as a set of cells that share identical receptor sequences at the nucleotide level, spanning the entirety of the V(D)J-C genes as well as the junction segments. Assuming reliable gene and CDR3 sequence calls by 10x Cellranger, we redefine clonotypes based on TCR annotation. Subsequently, GEMs with no clonotype annotation from 10x were annotated to existing clonotypes conditioned on matching VJαβ-genes and CDR3αβ sequences or as novel clonotypes. Similarly, clonotypes with identical VJ-CDR3αβ were merged to form larger groups of theoretically identical TCRs (Figure 3—figure supplement 1). Merging GEMs of the same TCR is essential to make statistical inference based on those groupings, for example, determine expected pMHC target per clonotype. The outcome was a set of 2441 TCR clonotypes across the 6073 GEMs containing both TCR and pMHC. 
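The regrouping of GEMs into clonotypes by identical VJ genes and CDR3αβ sequences can be sketched as a simple grouping on a composite key. This is a minimal illustration, not the ITRAP implementation: the field names (`barcode`, `v_a`, `j_a`, `cdr3_a`, `v_b`, `j_b`, `cdr3_b`) are hypothetical, and treating a missing chain as a `None` entry in the key is an assumption made here for simplicity.

```python
from collections import defaultdict

def tcr_key(gem):
    """Composite key of V/J genes and CDR3 for both chains.

    Missing annotations become None (an assumption of this sketch).
    """
    fields = ("v_a", "j_a", "cdr3_a", "v_b", "j_b", "cdr3_b")
    return tuple(gem.get(f) for f in fields)

def merge_clonotypes(gems):
    """Group GEM barcodes that share an identical VJ-CDR3 alpha/beta key."""
    groups = defaultdict(list)
    for gem in gems:
        groups[tcr_key(gem)].append(gem["barcode"])
    return dict(groups)

# Toy data: the first two GEMs carry the same TCR and are merged.
toy_gems = [
    {"barcode": "g1", "v_a": "TRAV1", "j_a": "TRAJ1", "cdr3_a": "CAVA",
     "v_b": "TRBV2", "j_b": "TRBJ2", "cdr3_b": "CASS"},
    {"barcode": "g2", "v_a": "TRAV1", "j_a": "TRAJ1", "cdr3_a": "CAVA",
     "v_b": "TRBV2", "j_b": "TRBJ2", "cdr3_b": "CASS"},
    {"barcode": "g3", "v_a": "TRAV3", "j_a": "TRAJ4", "cdr3_a": "CALL",
     "v_b": "TRBV5", "j_b": "TRBJ1", "cdr3_b": "CAWS"},
]
clonotypes = merge_clonotypes(toy_gems)
```

Grouping on the key handles both cases described above in one pass: a GEM without a prior clonotype label joins an existing group when its key matches and otherwise founds a new one.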
For the 337 GEMs containing TCR chain multiplets, the most abundant chain per GEM was selected to represent the true TCR in the subsequent analyses. Note that this annotation was made after the definition of clonotypes and was applied for the TCR inter- versus intra-similarity comparison.

Defining pMHC recognition for selected TCR clonotypes

As we have seen earlier, not all GEMs within a given clonotype support the same pMHC target, and defining the pMHC target of a TCR based on individual GEMs thus results in contradicting annotations. The key to identifying the expected target for a clonotype is therefore to determine which pMHC identity represents the majority of UMIs across all GEMs within a given clonotype. Figure 3 illustrates an example from a pilot study that accentuates the importance of studying GEMs in ensemble rather than individually. Most GEMs are annotated with multiplets of pMHCs, and across all GEMs the most abundant pMHC varies. While all pMHCs are found most abundant in at least one GEM, three pMHCs (TPR B0702, VTE A0101, and RAK B0801) are more often found most abundant (Figure 3a). Although TPR B0702 is detected in fewer GEMs (136) than VTE A0101 (260) and RAK B0801 (186), TPR B0702 is present at generally higher UMI counts (Figure 3b). It is evident that there is a difference in UMI distributions between the different pMHCs within the GEMs of a given clonotype, and that TPR B0702 is the significantly most abundant pMHC across the ensemble of GEMs even though this pMHC is only present in a minority of the GEMs (Figure 3b). Based on these observations, we argue that the significantly most abundant pMHC should be annotated as the expected binder for the given clonotype rather than annotating based on the majority.

Figure 3. An example of pMHC concordance in clonotype 1 (example from pilot study).
(a) All detected pMHC (y-axis) in each gel-bead in emulsion (GEM) (x-axis, n = 467) of clonotype 1. The marker size shows the unique molecular identifier (UMI) count for the particular pMHC in a given GEM, and the color indicates the pMHC with the highest UMI count, similar to what is shown in Figure 2d. If two pMHCs are equally abundant in a GEM, they are both colored. No marker means no detection of that pMHC in that given GEM. (b) The compiled distribution of UMI counts for each peptide (assigning 0 UMI when the pMHC is not detected in a GEM). The asterisk marks that a Wilcoxon test showed that the UMI counts of TPR B0702 were on average higher than the UMI counts of VTE A0101. (c) The specificity concordance across the GEMs of clonotype 1. Concordance is shown by a color gradient, that is, the larger the fraction of GEMs supporting a given specificity the darker the color.

Having annotated the expected pMHC of a given clonotype, one can next go back to the individual GEMs, label GEMs where the most abundant pMHC corresponds to the expected binder as ‘true’ and all others as ‘false,’ and use these annotations to quantify the accuracy of the GEM annotations. Within each clonotype, one can compute a specificity concordance, that is, the fraction of GEMs detected with a certain specificity (defined by most abundant pMHC, i.e., highest pMHC UMI per GEM) (Figure 3c). In many cases across the full data set, the expected specificity for a clonotype coincides with the specificity defined on a per-GEM level, resulting in high concordance. However, for some clonotypes, for example, clonotype 1, GEMs have diverging annotations and therefore lower concordance dispersed across multiple specificities (Figure 3). The clonotype visualized in Figure 3 is specifically chosen to exemplify how this lower concordance can affect the analysis. For clonotype 1, the fraction of GEMs that support VTE A0101 (0.33) is higher than the fraction of GEMs that support TPR B0702 (0.26).
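The difference between per-GEM concordance and the ensemble-level expected target can be made concrete with a small sketch. It is a simplification under stated assumptions: each GEM of a clonotype is a dictionary mapping a pMHC label to its UMI count, the expected target is taken as the pMHC with the largest summed UMI count across the clonotype, and the Wilcoxon significance test used in the text is omitted. The function names and toy counts are hypothetical.

```python
from collections import Counter

def top_pmhc(umis):
    """Most abundant pMHC in one GEM, or None if empty or tied at the top."""
    ranked = sorted(umis.items(), key=lambda kv: kv[1], reverse=True)
    if not ranked:
        return None
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]

def expected_target(gems):
    """pMHC with the highest summed UMI count across all GEMs of a clonotype."""
    totals = Counter()
    for umis in gems:
        totals.update(umis)  # adds the UMI counts per pMHC
    return totals.most_common(1)[0][0]

def concordance(gems):
    """Fraction of GEMs whose most abundant pMHC is each specificity."""
    tops = Counter(top_pmhc(g) for g in gems)
    return {p: n / len(gems) for p, n in tops.items() if p is not None}

# Toy clonotype: VTE tops more GEMs, but TPR dominates the UMI ensemble.
toy_clonotype = [{"VTE": 2, "TPR": 1}, {"VTE": 2}, {"TPR": 30, "VTE": 1}]
target = expected_target(toy_clonotype)
fractions = concordance(toy_clonotype)
```

In this toy clonotype the per-GEM majority points to VTE while the summed UMI counts point to TPR, which mirrors the situation described for clonotype 1.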
This results in an overall low concordance, and only by considering the complete ensemble of clonotype 1 GEMs can the correct pMHC target be identified (Figure 3b).

Improving concordance between GEM and clonotype annotation based on grid search on UMI features

To rationally filter data, an accuracy metric was defined and optimized through the filtering process. For all specificities belonging to clonotypes with an assigned expected target, we calculated the overall accuracy as the proportion of GEMs where the highest-abundance pMHC annotation corresponds to the expected target of the clonotype. The raw unfiltered data yielded accuracy and average concordance scores of 69.6 and 83.8%, respectively. Next, we set out to investigate how different data-driven UMI filters could improve these performance values, removing noise and artifacts from the data. This removal would also reduce the number of included observations, hence the performance of different thresholds for filtering the data was evaluated based on a tradeoff between increased accuracy and discarded number of GEMs. We tested various thresholds on UMI count and UMI ratios, that is, the ratio between the most abundant and second most abundant UMI feature, for pMHC and TCRαβ, respectively. The optimal thresholds were chosen to maximize the weighted average between accuracy and fraction of retained GEMs, favoring an increase in accuracy over the loss of some GEMs. This filtering analysis resulted in an optimal pMHC UMI count threshold of two and a ratio of >1 between the most abundant and second most abundant pMHC UMI counts. The latter results in the removal of GEMs where two pMHCs were equally abundant at low UMI counts. The search did not result in thresholds imposing restrictions on either TCR UMI counts or TCR UMI ratio, which underpins that the TCRs with a missing chain as well as multiple chains also contribute to good performance.
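The threshold search can be sketched as an exhaustive grid search over a pMHC UMI count threshold and a UMI ratio threshold. Several choices here are assumptions rather than details given in the text: the weight of 0.75 on accuracy, the use of `>=` for the count threshold, keeping the expected target fixed during the search, and the hypothetical field names `pmhc_umis` and `expected`. TCR thresholds are left out for brevity.

```python
from itertools import product

def passes(gem, min_umi, min_ratio):
    """Keep a GEM if its top pMHC has enough UMIs and beats the runner-up."""
    counts = sorted(gem["pmhc_umis"].values(), reverse=True)
    if not counts or counts[0] < min_umi:  # '>=' threshold is an assumption
        return False
    second = counts[1] if len(counts) > 1 else 0
    # Equivalent to top/second > min_ratio, without dividing by zero.
    return counts[0] > min_ratio * second

def top_pmhc(gem):
    """Most abundant pMHC of a GEM (first one wins on ties)."""
    return max(gem["pmhc_umis"], key=gem["pmhc_umis"].get)

def grid_search(gems, umi_grid, ratio_grid, weight=0.75):
    """Maximize weight * accuracy + (1 - weight) * fraction of retained GEMs."""
    best = None
    for min_umi, min_ratio in product(umi_grid, ratio_grid):
        kept = [g for g in gems if passes(g, min_umi, min_ratio)]
        if not kept:
            continue
        accuracy = sum(top_pmhc(g) == g["expected"] for g in kept) / len(kept)
        retained = len(kept) / len(gems)
        score = weight * accuracy + (1 - weight) * retained
        if best is None or score > best["score"]:
            best = {"score": score, "min_umi": min_umi, "min_ratio": min_ratio,
                    "accuracy": accuracy, "retained": retained}
    return best

# Toy GEMs from one clonotype whose expected target is "A".
toy_gems = [
    {"pmhc_umis": {"A": 10, "B": 1}, "expected": "A"},
    {"pmhc_umis": {"A": 8}, "expected": "A"},
    {"pmhc_umis": {"B": 1, "A": 1}, "expected": "A"},  # low-count tie
    {"pmhc_umis": {"B": 1}, "expected": "A"},          # single ambient UMI
    {"pmhc_umis": {"B": 3, "A": 3}, "expected": "A"},  # tie above the count cut
]
best = grid_search(toy_gems, umi_grid=[1, 2], ratio_grid=[0, 1])
```

On this toy data the search selects a count threshold of two together with a ratio threshold of one, because the gain in accuracy outweighs the GEMs lost under the chosen weight. A different weight could shift the optimum.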
Imposing this filter yielded 4986 GEMs (82% of total) and 1494 clonotypes (61% of total), and resulted in 95.3% accuracy and a mean concordance of 90.6%.

Additional filters

Additional filters can be added to further clean the data. We investigated how an integrated filter in the 10x Genomics software, Cellranger, performed in removing potential noise from our data set (Figure 3—figure supplement 2). The filter (labeled ‘is cell’) evaluates whether a GEM has captured a cell based on the full level of transcriptome data, when available, and otherwise solely on TCR transcript level. The filter was tested with both levels of transcript data, full level and TCR transcript level, which are respectively referred to as ‘is cell (GEX)’ and ‘is cell.’ Alternatively, viable cells are identified from the transcript data, independently of Cellranger, based on mitochondrial load and a minimum and maximum gene count per GEM. All three filterings are comparable (Figure 3—figure supplement 2) and taken into account in the further evaluations. It is worth noting that, while the filterings based on the full transcript data might remove slightly more noise, the associated costs suggest that they should only be applied when the transcript data is required for additional purposes. Cell hashing is generally a much simpler task to resolve than pMHC multiplets because one hashing entry most often has much higher counts compared to the others (Figure 3—figure supplement 3). Moreover, due to the experimental design, where only one hashing antibody is added to each sample, it is expected that only a single hashing signal is associated with each GEM, that is, this does not mirror the complex nature of the pMHC data, where cross-reactivity could result in more than one pMHC being a true binder to a given TCR. Given this simplicity, we opted for utilizing the existing Seurat hashtag oligo (HTO) tool to demultiplex and annotate cell hashing (Stoeckius et al., 2018).
In this setup, cell hashing also enables filtering based on matching HLA between the donor haplotypes and the HLA of the detected pMHC. Including this additional filter reduces the number of GEMs to 4135 (covering a set of 1494 clonotypes). Additionally, depending on the subsequent use of the data, retaining only complete TCRs containing both α and β may be desirable. Including only GEMs where the TCR-pMHC pair is observed more than once, that is, specificity multiplets, reduces the uncertainty described above. Below we investigate the impact of imposing such filters.

Impacts of filtering

Evaluating filters by comparing TCR similarity across specificity

To objectively evaluate the performance impact of the presented filters, we define a quantitative evaluation based on the hypothesis that T cells binding the same pMHC (intra-specificity) will share a higher sequence similarity compared to TCRs of different specificities (inter-specificity) (Figure 4). Thus, filtering away artifacts should increase intra-similarity while decreasing the inter-similarity. Here, the similarity score between two TCRs was calculated from the summed score of the pairwise α- and β-chain similarities calculated using a kernel method described in Shen et al., 2012 and applied in Chronister et al., 2021.

Figure 4. Intra- and inter TCR similarity sc
