Abstract

The fundamental principles of HIV-1 integration into the human genome have been revealed over the past two decades. However, the impact of the integration site on proviral transcription and expression remains poorly understood. Addressing this question requires the analysis of multiple genomic datasets across thousands of proviral integration sites. Here, we generated and combined large-scale datasets, including epigenetic, transcriptomic, and 3-dimensional genome architecture data, to interrogate the chromatin states, transcriptional activity, and nuclear sub-compartments around HIV-1 integration sites in Jurkat CD4+ T cells. Our aim was to decipher the regulatory features of the human genome that shape the transcription of proviral classes defined by their position and orientation in the genome. Using a Hidden Markov Model and a ranking of informative values, followed by a machine learning logistic regression model, we defined nuclear sub-compartments and chromatin states that contribute to the genomic architecture, transcriptional activity, and nucleosome density of regions neighboring the integration site as additive features influencing HIV-1 expression. Our integrated genomics approach also enables a robust experimental design in which HIV-1 can be genetically introduced into precise genomic locations with known regulatory features to assess the relationship between integration position and viral transcription and fate.
