Abstract
In mammals, approximately 10% of genome sequences correspond to endogenous viral elements (EVEs), which are derived from ancient viral infections of germ cells. Although most EVEs have been inactivated, some open reading frames (ORFs) of EVEs have acquired functions in their hosts. However, EVE ORFs usually remain unannotated in the genomes, and no databases are available for EVE ORFs. To investigate the function and evolution of EVEs in mammalian genomes, we developed EVE ORF databases for 20 genomes of 19 mammalian species. A total of 736,771 non-overlapping EVE ORFs were identified and archived in a database named gEVE (http://geve.med.u-tokai.ac.jp). The gEVE database provides nucleotide and amino acid sequences, genomic loci and functional annotations of EVE ORFs for all 20 genomes. In analyzing RNA-seq data with the gEVE database, we successfully identified the expressed EVE genes, suggesting that the gEVE database facilitates genomic analyses of various mammalian species.
Database URL: http://geve.med.u-tokai.ac.jp
Highlights
Approximately 10% of mammalian genome sequences correspond to endogenous viral elements (EVEs), including endogenous retroviruses (ERVs), which are thought to be derived from ancient viral infections of germ cells [1,2,3,4].
To retrieve EVEs that are missed by the two computational programs, we performed similarity searches using BLAT [29] against each genome (Figure 1B, STEP 3) using the following amino acid sequences: (i) all viral sequences encoding proteins stored in the NCBI RefSeq database, (ii) 131 known EVE genes and (iii) all 774,172 EVE sequences identified in STEP 2.
We summarized the EVE open reading frame (ORF) sequences that contain viral motifs and encode >80 amino acids by removing overlapping sequences while accounting for reading frames (Figure 1B, STEP 4).
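The overlap removal in STEP 4 can be illustrated with a minimal Python sketch. The source does not describe the exact procedure, so everything below is an assumption for illustration and not the gEVE pipeline itself: the `Orf` record, the `non_overlapping_orfs` function, the half-open coordinate convention, and the greedy rule that keeps the longest ORF among those overlapping on the same strand and in the same reading frame. Only the >80 amino acid threshold comes from the text.

```python
from dataclasses import dataclass

MIN_AA = 80  # from the text: ORFs must encode >80 amino acids


@dataclass(frozen=True)
class Orf:
    """Hypothetical ORF record; coordinates are 0-based, half-open."""
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"

    @property
    def aa_length(self) -> int:
        return (self.end - self.start) // 3

    @property
    def frame_key(self):
        # ORFs share a reading frame when their translation start points
        # are congruent modulo 3 on the same strand of the same sequence.
        phase = self.start % 3 if self.strand == "+" else self.end % 3
        return (self.chrom, self.strand, phase)


def non_overlapping_orfs(orfs):
    """Keep ORFs longer than MIN_AA; within each reading frame, drop any
    ORF that overlaps a longer one already kept (assumed greedy rule)."""
    by_frame = {}
    for orf in orfs:
        if orf.aa_length > MIN_AA:
            by_frame.setdefault(orf.frame_key, []).append(orf)

    kept = []
    for group in by_frame.values():
        group.sort(key=lambda o: (-o.aa_length, o.start))  # longest first
        chosen = []
        for orf in group:
            if all(orf.end <= c.start or orf.start >= c.end for c in chosen):
                chosen.append(orf)
        kept.extend(chosen)
    return sorted(kept, key=lambda o: (o.chrom, o.start, o.end, o.strand))
```

Under this rule, ORFs that overlap in different reading frames are both retained, which is one possible reading of "accounting for reading frames"; other interpretations would need a different grouping key.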
Summary
Approximately 10% of mammalian genome sequences correspond to endogenous viral elements (EVEs), including endogenous retroviruses (ERVs), which are thought to be derived from ancient viral infections of germ cells [1,2,3,4]. For HERVd (http://herv.img.cas.cz), the reference human genome sequence is out of date, and the database is apparently not maintained, as its last update was on September 19, 2003. Neither database provides ORFs for each EVE sequence.