The host chromatin-binding factor LEDGF/p75 interacts with HIV-1 integrase and directs integration to active transcription units. To understand how LEDGF/p75 recognizes transcription units, we sequenced 1 million HIV-1 integration sites isolated from cultured HEK293T cells. Analysis of integration sites showed that cancer genes were preferentially targeted, raising concerns about using lentivirus vectors for gene therapy. Additional analysis led to the discovery that introns and alternative splicing contributed significantly to integration site selection. These correlations were independent of transcription levels, size of transcription units, and length of the introns. Multivariate analysis with five parameters previously found to predict integration sites showed that intron density is the strongest predictor of integration density in transcription units. Analysis of previously published HIV-1 integration site data showed that integration density in transcription units in mouse embryonic fibroblasts also correlated strongly with intron number, and this correlation was absent in cells lacking LEDGF. Affinity purification showed that LEDGF/p75 is associated with a number of splicing factors, and RNA sequencing (RNA-seq) analysis of HEK293T cells lacking LEDGF/p75 or the LEDGF/p75 integrase-binding domain (IBD) showed that LEDGF/p75 contributes to splicing patterns in half of the transcription units that have alternative isoforms. Thus, LEDGF/p75 interacts with splicing factors, contributes to exon choice, and directs HIV-1 integration to transcription units that are highly spliced.
Read full abstract