Abstract

At least 8% of the human genome was formed by integration of retroviral DNA sequences. Here we analyze the forces directing the accumulation of human endogenous retroviruses (HERVs) by comparing de novo HERV integration targeting with the distribution of fixed HERV elements in the human genome. All known genomic HERVs are inactive due to mutation, but we were able to study integration targeting using a reconstituted consensus HERV-K, designated HERV-K(Con). We found that HERV-K(Con) integrated preferentially in transcription units, in gene-rich regions, and near features associated with active transcription units and their regulatory regions. In contrast, genomic HERV-K proviruses are found preferentially outside transcription units. The minority of genomic HERV-Ks present inside transcription units are in the opposite transcriptional orientation relative to the host gene, the orientation predicted to be minimally disruptive to host mRNA synthesis, whereas de novo HERV-K(Con) integration within transcription units showed no orientation bias. We also found that the youngest HERV-K elements in the human genome showed a distribution intermediate between de novo HERV-K(Con) integration sites and older fixed HERV-Ks. These findings indicate that accumulation of HERVs in the human germline is a two-step process: integration targeting biases direct initial accumulation, then purifying selection leads to loss of proviruses that disrupt gene function.

