AbstractIntegration of retroviral vectors in the human genome follows nonrandom patterns that favor insertional deregulation of gene expression and increase the risk of their use in clinical gene therapy. The molecular basis of retroviral target site selection is still poorly understood. We used deep sequencing technology to build genomewide, high-definition maps of > 60 000 integration sites of Moloney murine leukemia virus (MLV)– and HIV-based retroviral vectors in the genome of human CD34+ multipotent hematopoietic progenitor cells (HPCs) and used gene expression profiling, chromatin immunoprecipitation, and bioinformatics to associate integration to genetic and epigenetic features of the HPC genome. Clusters of recurrent MLV integrations identify regulatory elements (alternative promoters, enhancers, evolutionarily conserved noncoding regions) within or around protein-coding genes and microRNAs with crucial functions in HPC growth and differentiation, bearing epigenetic marks of active or poised transcription (H3K4me1, H3K4me2, H3K4me3, H3K9Ac, Pol II) and specialized chromatin configurations (H2A.Z). Overall, we mapped 3500 high-frequency integration clusters, which represent a new resource for the identification of transcriptionally active regulatory elements. High-definition MLV integration maps provide a rational basis for predicting genotoxic risks in gene therapy and a new tool for genomewide identification of promoters and regulatory elements controlling hematopoietic stem and progenitor cell functions.
Read full abstract