A novel statistical method for decontaminating T-cell receptor sequencing data.

Ruoxing Li, Mehmet Altan, Jianjun Zhang, John V Heymach, Latasha Little, Ruitao Lin, Ziyi Li, Runzhe Chen, Shawna Hubert, Alexandre Reuben, Hai Tran

doi:10.1093/bib/bbad230

Abstract

The T-cell receptor (TCR) repertoire is highly diverse across the population and plays an essential role in initiating multiple immune processes. TCR sequencing (TCR-seq) has been developed to profile the T-cell repertoire. As in other high-throughput experiments, contamination can occur at several steps of TCR-seq, including sample collection, preparation and sequencing. Such contamination creates artifacts in the data, leading to inaccurate or even biased results. Most existing methods assume 'clean' TCR-seq data as the starting point and cannot handle data contamination. Here, we develop a novel statistical model to systematically detect and remove contamination in TCR-seq data. We classify the observed contamination into two sources, pairwise and cross-cohort. For both sources, we provide visualizations and summary statistics to help users assess the severity of the contamination. Incorporating prior information from 14 existing TCR-seq datasets with minimal contamination, we develop a straightforward Bayesian model to statistically identify contaminated samples. We further provide strategies for removing the impacted sequences to allow for downstream analysis, thus avoiding any need to repeat experiments. In simulation studies, our proposed model shows robustness in contamination detection compared with a few off-the-shelf detection methods. We illustrate the use of our proposed method on two TCR-seq datasets generated locally.
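The abstract does not specify the summary statistics or the Bayesian model, so the following is only a minimal sketch of the general idea behind screening for pairwise contamination: measuring how many sequences two samples share. It uses a plain Jaccard overlap as a stand-in, which is an assumption and not the authors' statistic; the function name `pairwise_sharing`, the sample IDs and the toy sequences are all hypothetical.

```python
from itertools import combinations


def pairwise_sharing(samples):
    """Return the Jaccard overlap of sequence sets for every pair of samples.

    `samples` maps a sample ID to an iterable of clonotype sequences.
    This is a generic overlap measure, not the statistic used in the paper.
    """
    overlaps = {}
    for a, b in combinations(sorted(samples), 2):
        set_a, set_b = set(samples[a]), set(samples[b])
        union = set_a | set_b
        # Guard against two empty samples to avoid division by zero.
        overlaps[(a, b)] = len(set_a & set_b) / len(union) if union else 0.0
    return overlaps


# Hypothetical toy data (made-up sequences, for illustration only).
toy = {
    "S1": ["CASSLGQETQYF", "CASSPDRGNTEAFF", "CASSQETQYF"],
    "S2": ["CASSLGQETQYF", "CASSPDRGNTEAFF", "CASSYEQYF"],
    "S3": ["CASRTGELFF"],
}
overlaps = pairwise_sharing(toy)
for pair, value in overlaps.items():
    print(pair, round(value, 2))
```

In a screen of this kind, a pair of unrelated samples with unusually high sharing would be a candidate for closer inspection. What counts as unusually high depends on the data, which is presumably where prior information from reference datasets comes in.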

Journal: Briefings in Bioinformatics
Publication Date: Jun 19, 2023
Citations: 1

Similar Papers

Decision letter: TCR meta-clonotypes for biomarker discovery with tcrdist3 enabled identification of public, HLA-restricted clusters of SARS-CoV-2 TCRs
Benny Chain ... Aleksandra M Walczak
04 May 2021

Editor's evaluation: TCR meta-clonotypes for biomarker discovery with tcrdist3 enabled identification of public, HLA-restricted clusters of SARS-CoV-2 TCRs
Benny Chain
04 May 2021

Abstract PR14: Identification of specificity TCR groups of tumor antigen specific T-cells
Liang Chen ... Chunlin Wang
Cancer Immunology Research | VOL. 7
01 Feb 2019

Profiling of T Cell Repertoire in SARS-CoV-2-Infected COVID-19 Patients Between Mild Disease and Pneumonia.
Che-Mai Chang ... Kang-Yun Lee
Journal of Clinical Immunology | VOL. 41
05 May 2021