537. New Graph-Based Algorithm for Comprehensive Identification and Tracking Retroviral Integration Sites

Andrea Calabria, Stefano Beretta, Ivan Merelli, Giulio Spinozzi, Stefano Brasca, Fabrizio Benedicenti, Erika Tenderini, Alessandra Biffi, Eugenio Montini

doi:10.1016/s1525-0016(16)33345-7

Abstract

Vector integration sites (IS) in hematopoietic stem cell (HSC) gene therapy (GT) applications are stable genetic marks, distinctive for each independent cell clone and its progeny. Characterizing IS makes it possible to identify each cell clone and track its fate individually across tissues, cell lineages, and time, and is required for assessing the safety and efficacy of the treatment. Bioinformatics pipelines for IS detection used in GT treat the sequence reads that map to the same position of the reference genome as a single IS, but discard reads that map ambiguously to multiple genomic regions. The loss of such a significant portion of patients’ IS may hide potential malignant events, thus reducing the reliability of IS studies. We developed a novel tool that accurately identifies IS in any genomic region, even one composed of repetitive sequences. Our approach starts with a genome-free analysis of the sequencing reads: it builds an undirected graph in which nodes are the input sequences and edges represent valid alignments (above a specific identity threshold) between pairs of nodes. By analyzing and decomposing the graph, the method identifies indivisible subgraphs of sequences (clusters), each corresponding to an IS. Once the consensus sequence of each cluster has been extracted and aligned to the reference genome, we collect the alignment results and the annotation labels from RepeatMasker. By combining the set of genomic coordinates with the annotation labels, the method retraces the initial sequence graph, statistically validates the clusters through a permutation test, and produces the final list of IS. We tested the reliability of our tool on 3 IS datasets generated from simulated sequencing reads with increasing rates of nucleotide variation (0%, 0.25% and 0.5%) and on real data from a cell line with known IS, and we compared our tool to VISPA and UClust, which are used for GT studies.
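The genome-free graph step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the aligner, the identity threshold, or how "indivisible subgraphs" are computed, so this example assumes a stand-in similarity score from Python's `difflib`, a hypothetical threshold of 0.9, and plain connected components as the clusters. The function names and the toy reads are illustrative only.

```python
from difflib import SequenceMatcher
from itertools import combinations


def identity(a: str, b: str) -> float:
    """Stand-in similarity score in [0, 1]; an assumption, not a real aligner."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def build_graph(reads, threshold=0.9):
    """Undirected graph: one node per read, an edge when identity >= threshold.

    The threshold value is hypothetical; the abstract does not state one.
    """
    graph = {i: set() for i in range(len(reads))}
    for i, j in combinations(range(len(reads)), 2):
        if identity(reads[i], reads[j]) >= threshold:
            graph[i].add(j)
            graph[j].add(i)
    return graph


def connected_components(graph):
    """Group nodes into clusters with an iterative depth-first traversal.

    Connected components are used here as a simple proxy for the
    'indivisible subgraphs' mentioned in the abstract.
    """
    seen, clusters = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        clusters.append(sorted(component))
    return clusters


# Toy reads (made up for illustration): two groups differing by one base each.
reads = [
    "ACGTACGTACGTACGTACGT",
    "ACGTACGTACGTACGTACGA",  # one mismatch vs. read 0
    "TTGGCCAATTGGCCAATTGG",
    "TTGGCCAATTGGCCAATTGC",  # one mismatch vs. read 2
]
clusters = connected_components(build_graph(reads))
print(clusters)
```

In this sketch each cluster of read indices would stand for one candidate IS; the consensus extraction, reference alignment, RepeatMasker annotation, and permutation test described in the abstract are not modeled.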
In the simulated datasets, our tool achieved precision and recall ranging from 0.85 to 0.97 and from 0.88 to 0.99, respectively, yielding an aggregate F-score ranging from 0.86 to 0.98, higher than those of VISPA and UClust. In the experimental case of sequences from LAM-PCR products, our tool and VISPA identified all 6 known ISs for >98% of the reads produced, while UClust identified only 5 out of 6 ISs. We then used our tool to reanalyze the sequencing reads of our GT clinical trial for Metachromatic Leukodystrophy (MLD), recovering the previously hidden portion of IS. The overall number of ISs, sequencing reads, and estimated actively repopulating HSCs increased on average ~1.5-fold with respect to the previously published data obtained through VISPA, whereas the diversity index of the population did not change and no aberrant clones in repeats occurred. Our tool addresses and solves important open issues in retroviral IS identification and clonal tracking, allowing the generation of a comprehensive repertoire of IS.
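For readers unfamiliar with the metrics reported above, the following sketch shows how precision, recall, and an F-score can be computed when the true IS are known, as in a simulated dataset. It assumes the F-score is the usual harmonic mean of precision and recall (F1) and that an IS is matched exactly by chromosome, position, and strand; the abstract states neither detail. The coordinates are made up and do not come from the study.

```python
def precision_recall_f1(predicted, truth):
    """Compare predicted IS against known IS using exact matching.

    Exact matching on (chromosome, position, strand) is an assumption;
    a real evaluation might tolerate a few bases of positional offset.
    """
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical integration sites, for illustration only.
truth = [
    ("chr1", 1_000_100, "+"),
    ("chr2", 2_500_000, "-"),
    ("chr7", 55_000_321, "+"),
    ("chrX", 10_200_450, "-"),
]
predicted = [
    ("chr1", 1_000_100, "+"),
    ("chr2", 2_500_000, "-"),
    ("chr7", 55_000_321, "+"),
    ("chr9", 4_000_000, "+"),  # false positive; the chrX site is missed
]
precision, recall, f1 = precision_recall_f1(predicted, truth)
print(precision, recall, f1)
```

Here three of four predictions are correct and three of four true sites are found, so precision, recall, and F1 all come out at 0.75.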


Journal: Molecular Therapy	Publication Date: May 1, 2016
License type: cc-by-nc-nd
