Abstract

Background: Human endogenous retroviruses (HERVs) are surviving traces of ancient retrovirus infections and now reside within the human DNA. Recently HERV expression has been detected in both normal tissues and diseased patients. However, the activities (expression levels) of individual HERV sequences are mostly unknown.

Results: We introduce a generative mixture model, based on Hidden Markov Models, for estimating the activities of the individual HERV sequences from EST (expressed sequence tag) databases. We use the model to estimate the relative activities of 181 HERVs. We also empirically justify a faster heuristic method for HERV activity estimation and use it to estimate the activities of 2450 HERVs. The majority of the HERV activities were previously unknown.

Conclusion: (i) Our methods estimate activity accurately based on experiments on simulated data. (ii) Our estimate on real data shows that 7% of the HERVs are active. The active ones are spread unevenly into HERV groups and relatively uniformly in terms of estimated age. HERVs with the retroviral env gene are more often active than HERVs without env. Few of the active HERVs have open reading frames for retroviral proteins.

Highlights

  • Human endogenous retroviruses (HERVs) are surviving traces of ancient retrovirus infections and reside within the human DNA

  • We make the Hidden Markov Model (HMM) training time reasonable by applying two shortcuts: (i) only the HERV-EST (expressed sequence tag) pairs returned by BLAST are used; (ii) each EST is allowed to match the HERV sequence only in the immediate vicinity of the BLAST match

  • Sequences within a HERV group are very similar to each other, and the differences between groups are larger
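
The idea of treating HERV activities as mixture weights, estimated only over a restricted set of candidate HERV-EST pairs, can be illustrated with a small sketch. This is not the authors' HMM-based model: it assumes that a likelihood for each candidate pair is already available (in the paper this role is played by the HMM), and the function name `estimate_activities`, the input layout, and the toy numbers are all hypothetical. It runs a plain expectation-maximization loop for mixture proportions.

```python
from collections import defaultdict


def estimate_activities(pair_likelihoods, n_iter=200, tol=1e-9):
    """Estimate relative activities as mixture weights with EM.

    pair_likelihoods: dict mapping est_id -> {herv_id: likelihood}.
    Only candidate pairs are listed (assumption: these stand in for the
    pairs a similarity search such as BLAST would return); every other
    pair is treated as having zero likelihood.
    """
    hervs = sorted({h for cands in pair_likelihoods.values() for h in cands})
    if not hervs:
        raise ValueError("no candidate HERV-EST pairs given")

    # Start from uniform relative activities.
    weights = {h: 1.0 / len(hervs) for h in hervs}

    for _ in range(n_iter):
        # E-step: split each EST across its candidate HERVs in
        # proportion to weight * likelihood.
        counts = defaultdict(float)
        for cands in pair_likelihoods.values():
            total = sum(weights[h] * lik for h, lik in cands.items())
            if total == 0.0:
                continue  # this EST is not explained by any candidate
            for h, lik in cands.items():
                counts[h] += weights[h] * lik / total

        norm = sum(counts.values())
        if norm == 0.0:
            raise ValueError("all candidate likelihoods are zero")

        # M-step: new weights are the normalized expected EST counts.
        new_weights = {h: counts[h] / norm for h in hervs}
        change = max(abs(new_weights[h] - weights[h]) for h in hervs)
        weights = new_weights
        if change < tol:
            break

    return weights


if __name__ == "__main__":
    # Toy input with made-up likelihoods; "e5" is ambiguous between A and B.
    toy = {
        "e1": {"A": 0.9},
        "e2": {"A": 0.8},
        "e3": {"A": 0.5},
        "e4": {"B": 0.7},
        "e5": {"A": 0.4, "B": 0.4},
    }
    for herv, weight in estimate_activities(toy).items():
        print(herv, round(weight, 3))
```

Listing only candidate pairs keeps each E-step proportional to the number of pairs rather than to the number of ESTs times the number of HERVs, which is the kind of saving the first shortcut aims at. Ambiguous ESTs are shared among their candidates according to the current weights instead of being assigned to a single HERV.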


Summary

Introduction

Human endogenous retroviruses (HERVs) are surviving traces of ancient infections by retroviruses that have become fixed in human DNA. If ancient, highly mutated elements are included, HERV sequences form 8% of the human genome [1]. HERVs are DNA sequences with a typical retroviral structure. The internal part of a HERV consists of four retroviral genes: gag, pro, pol and env. A functional, active HERV can transcribe its genes and produce retroviral proteins. These proteins enable the HERV to move and copy
