Abstract

ABSTRACTOne lineage of human endogenous retroviruses (HERVs), HERV-K(HML2), is upregulated in many cancers, some autoimmune/inflammatory diseases, and HIV-infected cells. Despite 3 decades of research, it is not known if these viruses play a causal role in disease, and there has been recent interest in whether they can be used as immunotherapy targets. Resolution of both these questions will be helped by an ability to distinguish between the effects of different integrated copies of the virus (loci). Research so far has concentrated on the 20 or so recently integrated loci that, with one exception, are in the human reference genome sequence. However, this viral lineage has been copying in the human population within the last million years, so some loci will inevitably be present in the human population but absent from the reference sequence. We therefore performed the first detailed search for such loci by mining whole-genome sequences generated by next-generation sequencing. We found a total of 17 loci, and the frequency of their presence ranged from only 2 of the 358 individuals examined to over 95% of them. On average, each individual had six loci that are not in the human reference genome sequence. Comparing the number of loci that we found to an expectation derived from a neutral population genetic model suggests that the lineage was copying until at least ∼250,000 years ago.IMPORTANCE About 5% of the human genome sequence is composed of the remains of retroviruses that over millions of years have integrated into the chromosomes of egg and/or sperm precursor cells. There are indications that protein expression of these viruses is higher in some diseases, and we need to know (i) whether these viruses have a role in causing disease and (ii) whether they can be used as immunotherapy targets in some of them. Answering both questions requires a better understanding of how individuals differ in the viruses that they carry. We carried out the first careful search for new viruses in some of the many human genome sequences that are now available thanks to advances in sequencing technology. We also compared the number that we found to a theoretical expectation to see if it is likely that these viruses are still replicating in the human population today.

Highlights

  • One lineage of human endogenous retroviruses (HERVs), HERV-K(HML2), is upregulated in many cancers, some autoimmune/ inflammatory diseases, and HIV-infected cells

  • There are indications that protein expression of these viruses is higher in some diseases, and we need to know (i) whether these viruses have a role in causing disease and (ii) whether they can be used as immunotherapy targets in some of them

  • Another study [21] presented 31 putative loci. Only of these are genuine unfixed loci; the other represent a variety of mining artifacts. Eleven of these 15 loci were present in our The Cancer Genome Atlas (TCGA) patient genomes, and the other 4, all of which were present at low frequencies in that other study [21], were found to be present in our larger WGS500 sample

Read more

Summary

Introduction

One lineage of human endogenous retroviruses (HERVs), HERV-K(HML2), is upregulated in many cancers, some autoimmune/ inflammatory diseases, and HIV-infected cells. Despite 3 decades of research, it is not known if these viruses play a causal role in disease, and there has been recent interest in whether they can be used as immunotherapy targets Resolution of both these questions will be helped by an ability to distinguish between the effects of different integrated copies of the virus (loci). RNA transcription and protein expression of HK2 and other ERVs are elevated in many cancers, some autoimmune/inflammatory diseases, and HIV infection, and there has been a long and unresolved search for a causal role in disease [5,6,7] This elevation of protein expression in disease has led to research into their potential as immunotherapy targets for cancer and HIV treatment [8,9,10,11,12]. Research has been based on our knowledge of loci that are in the human reference genome plus the one full-length locus that is known to be in the human population but is not in the reference, K113 [20]

Methods
Results
Conclusion

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.