Identification of novel Zybavirus genome sequences and analysis of programmed ribosomal frameshifting motifs in the family Amalgaviridae
The family Amalgaviridae comprises monopartite double-stranded RNA viruses that encode two overlapping open reading frames (ORFs), ORF1 and ORF2. A programmed ribosomal frameshifting (PRF) mechanism facilitates the translation of an ORF1+2p fusion protein. Among the three recognized genera (Amalgavirus, Unirnavirus, and Zybavirus), Zybavirus remains poorly characterized, with only one approved species, Zygosaccharomyces bailii virus Z (ZbV-Z), and a few unclassified proposed members. In this study, we identified four novel zybavirus-like viral genome sequences, tentatively named Zygosaccharomyces bailii virus Z2 (ZbV-Z2), Cryptops hortensis-associated virus Z1 (ChaV-Z1), Drosophila suzukii-associated virus Z1 (DsaV-Z1), and Sand Creek Marshes virus Z1 (SCMV-Z1), from publicly available transcriptome datasets. Phylogenetic analysis placed ZbV-Z2, ChaV-Z1, and DsaV-Z1 in a well-supported clade with ZbV-Z and Xisha Islands zybavirus, supporting their classification within Zybavirus. SCMV-Z1 clustered with seven known viruses in a distinct lineage, which may represent a novel genus within the family Amalgaviridae. Comparative analysis of PRF sites in members of Zybavirus, Amalgavirus, and related clades revealed that UUU_CNN may represent a broader and ancestral consensus +1 PRF motif in this group of viruses. Our study highlights the utility of mining public transcriptome data for novel viral genome discovery and contributes to the refinement of both taxonomic classification and conserved genomic features within this viral family.
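As an illustration of how a consensus motif such as UUU_CNN might be located in a genome sequence, here is a minimal Python sketch. It is not the authors' pipeline. The function name `find_plus1_prf_candidates`, the `orf_start` parameter, and the toy sequences are hypothetical. The sketch assumes the underscore marks a codon boundary in the ORF1 reading frame, so only hexamers in frame with `orf_start` are reported.

```python
import re


def find_plus1_prf_candidates(rna, orf_start=0, second_base="ACGU"):
    """Return 0-based start positions of UUU_C[second_base]N hexamers
    that lie in frame with orf_start (assumed to be the ORF1 frame).

    second_base="ACGU" matches the broader UUU_CNN motif;
    second_base="G" restricts the search to UUU_CGN.
    """
    rna = rna.upper().replace("T", "U")
    # Lookahead so that overlapping candidates are not missed.
    pattern = re.compile(r"(?=UUUC[%s][ACGU])" % second_base)
    return [
        m.start()
        for m in pattern.finditer(rna)
        if (m.start() - orf_start) % 3 == 0
    ]


# Toy sequences, not real viral genomes.
cgn_example = "AUGGCUUUUCGUAAA"  # AUG GCU UUU_CGU AAA
cnn_example = "AUGGCUUUUCAUAAA"  # AUG GCU UUU_CAU AAA

print(find_plus1_prf_candidates(cgn_example))                   # broad motif
print(find_plus1_prf_candidates(cnn_example, second_base="G"))  # strict motif
```

Any hit is only a candidate. Confirming a frameshift site would still require checking that the ORF1 and ORF2 frames overlap at that position.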
- Research Article
24
- 10.5423/ppj.nt.11.2017.0243
- Apr 1, 2018
- The Plant Pathology Journal
The genome sequences of two novel monopartite RNA viruses were identified in a common eelgrass (Zostera marina) transcriptome dataset. Sequence comparison and phylogenetic analyses revealed that these two novel viruses belong to the genus Amalgavirus in the family Amalgaviridae. They were named Zostera marina amalgavirus 1 (ZmAV1) and Zostera marina amalgavirus 2 (ZmAV2). The genomes of both ZmAV1 and ZmAV2 contain two overlapping open reading frames (ORFs). ORF1 encodes a putative replication factory matrix-like protein, while ORF2 encodes an RNA-dependent RNA polymerase (RdRp) domain. The ORF1+2 fusion protein, which mediates RNA replication, is produced by the +1 programmed ribosomal frameshifting (PRF) mechanism. The +1 PRF motif sequence, UUU_CGN, which is highly conserved among known amalgaviruses, was also found in ZmAV1 and ZmAV2. Multiple sequence alignment of the ORF1+2 fusion proteins from 24 amalgaviruses revealed that +1 PRF occurred at only three different positions within a 13-amino acid segment, which was surrounded by highly conserved regions on both sides. This suggests that the +1 PRF site may be constrained by the structure of the fusion proteins. The genome sequences of ZmAV1 and ZmAV2, which are the first viruses to be identified in common eelgrass, will serve as useful resources for studying the evolution and diversity of amalgaviruses.
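The way a +1 frameshift joins two overlapping ORFs into one fusion protein can be sketched in a few lines of Python. This is a simplified model, not the method used in the study. It assumes the ribosome reads the UUU codon in frame 0, skips the next nucleotide, and continues in the +1 frame. The function names and the toy sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, built compactly in UCAG order.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(rna):
    """Translate from position 0 until a stop codon or the end of the sequence."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def translate_with_plus1_shift(rna, slip_codon_start):
    """Translate frame 0 through the codon at slip_codon_start (e.g. UUU),
    skip one nucleotide, then continue in the +1 frame.
    Simplified assumption: the shift happens exactly once, at that codon."""
    upstream = rna[:slip_codon_start + 3]
    downstream = rna[slip_codon_start + 4:]
    return translate(upstream) + translate(downstream)


# Toy sequence: AUG GCU UUU_CGU AAA UAA (frame 0 stops at UAA).
toy = "AUGGCUUUUCGUAAAUAA"
print(translate(toy))                      # frame-0 product only
print(translate_with_plus1_shift(toy, 6))  # fusion product after the shift
```

In the toy sequence the frame-0 product ends at the UAA stop codon, while the shifted reading continues past it, which is the general logic behind an ORF1+2 fusion.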
- Research Article
11
- 10.4149/av_2018_201
- Jan 1, 2018
- Acta virologica
Amalgaviridae is a family of double-stranded, monosegmented RNA viruses that are associated with plants, fungi, microsporidians, and animals. A sequence contig derived from the transcriptome of a eudicot, Cistus incanus (family Cistaceae; commonly known as hoary rockrose), was identified as the genome sequence of a novel plant RNA virus and named Cistus incanus RNA virus 1 (CiRV1). Sequence comparison and phylogenetic analysis indicated that CiRV1 is a novel species of the genus Amalgavirus in the family Amalgaviridae. The CiRV1 genome contig has two overlapping open reading frames (ORFs). ORF1 encodes a putative replication factory matrix-like protein, while ORF2 encodes an RNA-dependent RNA polymerase (RdRp) domain. An ORF1+2 fusion protein, which functions in viral RNA replication, is produced by a +1 programmed ribosomal frameshifting (PRF) mechanism. A +1 PRF motif UUU_CGU, which matches the conserved amalgavirus +1 PRF consensus sequence UUU_CGN, was found at the boundary of CiRV1 ORF1 and ORF2. Comparison of 25 amalgavirus ORF1+2 fusion proteins revealed that only three different positions within a 13-amino acid segment were recurrently used at the boundary, possibly having been selected so as not to interfere with the correct folding and function of the fusion protein. CiRV1 is the first virus found to be associated with a Cistus species and may be useful for studying amalgaviruses.
- Research Article
12
- 10.1007/s13258-019-00782-1
- Jan 16, 2019
- Genes & Genomics
Chia (Salvia hispanica) is a flowering plant in the family Lamiaceae, which produces seeds that are a rich source of various nutritional compounds. This study aimed to identify a novel RNA virus potentially associated with chia. Transcriptome data obtained from developing chia seeds were assembled into contigs. Sequence contigs containing an open reading frame (ORF) that showed amino acid identities with a viral RNA-dependent RNA polymerase (RdRp) were identified and analyzed. A genomic sequence of a novel plant RNA virus named Salvia hispanica RNA virus 1 (ShRV1) was identified in a chia seed transcriptome dataset. The ShRV1 genome sequence has two ORFs that showed high sequence identities with ORFs of known members of the genus Amalgavirus in the family Amalgaviridae. Amalgaviridae is a family of positive-sense double-stranded non-segmented RNA viruses that infect plants, fungi, and animals. The ShRV1 genome encodes two proteins: a putative replication factory matrix-like protein from ORF1 and an RdRp from the fused ORF of ORF1 and ORF2 by a +1 programmed ribosomal frameshifting (PRF) mechanism. A conserved +1 PRF motif sequence UUU_CGU was found at the ORF1/ORF2 boundary. A comparison of 31 amalgavirus ORF1+2 fusion proteins revealed that only three positions were repeatedly used as a +1 PRF site during amalgavirus evolution. ShRV1 is a novel virus found to be associated with chia and may be useful for studying the molecular features of amalgaviruses.
- Research Article
185
- 10.1006/viro.1994.1034
- Jan 1, 1994
- Virology
Beet Yellows Closterovirus: Complete Genome Structure and Identification of a Leader Papain-like Thiol Protease
- Research Article
15
- 10.3390/v11010081
- Jan 18, 2019
- Viruses
Three RNA viruses, Cucumis melo cryptic virus (CmCV), Cucumis melo amalgavirus 1 (CmAV1), and melon necrotic spot virus (MNSV), were identified from a melon (Cucumis melo) transcriptome dataset. CmCV has two dsRNA genome segments: dsRNA-1 is 1592 bp in size and contains a conserved RNA-dependent RNA polymerase (RdRp), and dsRNA-2 is 1715 bp in size and encodes a coat protein (CP). The sequence alignment and phylogenetic analyses of the CmCV RdRp and CP indicated that CmCV clusters with approved or putative deltapartitiviruses in a well-supported monophyletic clade. The RdRp of CmCV shared an amino acid sequence identity of 60.7% with the closest RdRp, that of beet cryptic virus 3, and is <57% identical to other partitiviruses. CmAV1 is a nonsegmented dsRNA virus with a genome of 3424 bp, including two partially overlapping open reading frames (ORFs) encoding a putative CP and RdRp. The sequence alignment and phylogenetic analyses of the CmAV1 RdRp revealed that it belongs to the genus Amalgavirus in the family Amalgaviridae. The RdRp of CmAV1 shares 57.7% amino acid sequence identity with the most closely related RdRp, that of Phalaenopsis equestris amalgavirus 1, and is <47% identical to the other reported amalgaviruses. These analyses suggest that CmCV and CmAV1 are novel species in the genera Deltapartitivirus and Amalgavirus, respectively. These findings enrich our understanding of new plant dsRNA virus species.
- Research Article
20
- 10.1046/j.1432-1327.2000.01379.x
- Jun 1, 2000
- European journal of biochemistry
The polyprotein of Cocksfoot mottle virus (CfMV) is encoded by two overlapping open reading frames (ORFs). The putative replicase of CfMV is produced as a part of the polyprotein from ORF2b by the -1 ribosomal frameshifting mechanism. The signals leading to -1 ribosomal frameshifting directed by CfMV RNA are the slippery heptamer UUUAAAC and a stem-loop structure starting seven nucleotides downstream from the heptamer. We studied the effect of different parts of the CfMV genome on the -1 ribosomal frameshifting efficiency using a wheat germ extract transcription/translation system. A point mutation in the slippery heptamer and a mutation deleting the stem-loop structure prevented frameshifting. A 70-nucleotide region of the CfMV sequence, including the slippery sequence and the stem-loop structure, was found to act as the minimal region for frameshifting. Interestingly, a termination codon introduced into the -1 frame 27 nucleotides downstream of the stem-loop structure increased frameshift efficiency threefold, while a similarly located termination codon in the 0 frame had no effect. Increases of fourfold to fivefold were observed when the polyprotein-encoding ORFs were fused together, which simultaneously led to the formation of a termination codon downstream of the frameshift signal. Possible reasons underlying these observations are discussed.
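The two -1 frameshift signals described above, a slippery heptamer and a downstream stem-loop, suggest a simple way to pull out candidate sites for inspection. The Python sketch below is illustrative only. The function name, the default `window` of 40 nucleotides, and the toy sequence are hypothetical. It also assumes that "seven nucleotides downstream" means a 7-nucleotide spacer between the heptamer and the stem-loop. It does not predict RNA secondary structure. It only returns the downstream region where a stem-loop would be expected.

```python
def find_slippery_sites(rna, heptamer="UUUAAAC", spacer=7, window=40):
    """Return (heptamer_start, downstream_region) for each heptamer match.

    downstream_region begins `spacer` nucleotides after the heptamer and
    spans up to `window` nucleotides. It is the stretch one would then
    examine, with a separate folding tool, for a possible stem-loop.
    """
    rna = rna.upper().replace("T", "U")
    sites = []
    start = rna.find(heptamer)
    while start != -1:
        region_start = start + len(heptamer) + spacer
        sites.append((start, rna[region_start:region_start + window]))
        start = rna.find(heptamer, start + 1)
    return sites


# Toy sequence: 2-nt leader, heptamer, 7-nt spacer, then a GC-rich stretch.
toy = "GG" + "UUUAAAC" + "ACGUACG" + "GGGGCCCC"
for position, region in find_slippery_sites(toy):
    print(position, region)
```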
- Research Article
115
- 10.1074/jbc.m705676200
- Dec 1, 2007
- Journal of Biological Chemistry
Paternally expressed gene 10 (PEG10) is a mammalian gene that is essential for embryonic development in mice. The gene contains two overlapping open reading frames (ORF1 and ORF2) and is derived from a retroelement that acquired a cellular function. It is not known if both reading frames are required for PEG10 function. Synthesis of ORF2 would be possible only if programmed -1 frameshifting occurred during ORF1 translation. In this study the frameshifting activity of PEG10 was analyzed in vivo, and a potential role for ORF2 was investigated. Phylogenetic analysis demonstrated that PEG10 is highly conserved in therian mammals, with all species retaining the elements necessary for frameshifting as well as functional motifs in each ORF. The frameshift site of PEG10 was highly active in cultured cells and produced the ORF1-2 protein. In mice, endogenous ORF1 and an ORF1-2 frameshift protein were detected in the developing placenta and amniotic membrane from 9.5 days post-coitus through to term with a very high frameshift efficiency (>60%). Mutagenesis of the active site motif of a putative protease within ORF2 showed that this enzyme is active and participates in post-translational processing of PEG10 ORF1-2. Both PEG10 proteins were also detected in first trimester human placenta. By contrast, neither protein expression nor frameshifting was detected in adult mouse tissues. These studies imply that the ORF1-2 protein, synthesized utilizing the most efficient -1 frameshift mechanism yet documented in vivo, will have an essential function that is intrinsic to the importance of PEG10 in mammals.
- Research Article
62
- 10.1006/viro.1995.1118
- Mar 1, 1995
- Virology
The Putative Replicase of the Cocksfoot Mottle Sobemovirus Is Translated as a Part of the Polyprotein by -1 Ribosomal Frameshift
- Research Article
6
- 10.1016/0042-6822(90)90358-x
- Oct 1, 1990
- Virology
The hepatitis B virus X-C fusion protein is unlikely to be produced by the mechanism of ribosomal frameshifting
- Research Article
38
- 10.1006/viro.1993.1592
- Nov 1, 1993
- Virology
Transcriptional Analysis of the Virion-Sense Genes of the Geminivirus Beet Curly Top Virus
- Research Article
39
- 10.1099/0022-1317-68-8-2117
- Aug 1, 1987
- The Journal of general virology
The nucleotide sequence of bovine papillomavirus type 4 (BPV-4) was determined. The viral genome is 7261 base pairs long. Several overlapping open reading frames (ORFs) have been identified both on the basis of amino acid comparison with other papillomaviruses and on their transcriptional pattern. Eight early ORFs (E1 to 8) were recognized, coding for DNA replication and cell transformation functions, and three late ORFs (L1 to 3), coding for structural proteins. Like the E5 ORF of human papillomavirus type 6 the E5 ORF of BPV-4 is discontinuous. Unlike other papillomaviruses, the non-coding region upstream of the early ORFs (ncr-1) is short (385 base pairs), but there is another non-coding region (ncr-2) of nearly 500 base pairs between the L2 and L1 ORFs. Most of the putative regulatory sites are located in the ncr-1, although potential controlling elements are also found in other parts of the genome. Polyadenylation sites are present at the 3' end of both the early and the late transcription units. Comparison between the polypeptides of BPV-4 and other papillomaviruses showed that BPV-4 is evolutionarily closer to the epitheliotropic human and rabbit viruses than to BPV-1.
- Research Article
23
- 10.1128/mcb.12.9.4242-4248.1992
- Sep 1, 1992
- Molecular and cellular biology
The genomic structure of the rat LINE (L1Rn) DNA element contains two overlapping open reading frames (ORFs) and apparently has a potential to code for a DNA/RNA-binding protein (in ORF1) and a reverse transcriptase (in ORF2). We have characterized a 1,630-bp L1Rn cDNA clone encompassing the overlapping ORFs and a 600-bp genomic fragment derived from a full-length L1Rn member and containing the beginning of ORF1. These DNAs were used to restore in part the ORF1-ORF2 organization of L1Rn after being cloned into the pSP65 vector under the control of SP6 polymerase promoter. To test whether L1Rn ORF1 and ORF2 are expressed as a fusion protein, a series of capped RNAs with progressive truncations containing one or both ORFs were prepared and translated in the rabbit reticulocyte lysate. Our analysis indicates that the expression of a putative reverse transcriptase-encoded L1Rn ORF2 in vitro is regulated by reinitiation or internal initiation of translation but not by ribosomal frameshifting.
- Research Article
30
- 10.1099/0022-1317-81-11-2783
- Nov 1, 2000
- Journal of General Virology
The polyprotein of Cocksfoot mottle virus (CfMV; genus Sobemovirus) is translated from two overlapping open reading frames (ORFs) 2a and 2b by a -1 ribosomal frameshifting mechanism. In this study, a 12 kDa protein was purified from viral RNA-derived samples that appears to correspond to the CfMV genome-linked protein (VPg). According to the determined N-terminal amino acid sequence, the VPg domain is located between the serine proteinase and replicase motifs and the N terminus of VPg is cleaved from the polyprotein between glutamic acid and asparagine residues. Western blot analysis of infected plant material showed that the polyprotein is processed at several additional sites. An antiserum against the ORF 2a product recognized six distinct proteins, whereas, of these, the VPg antiserum clearly recognized only a 24 kDa protein. This indicates that the fully processed 12 kDa VPg detected in viral RNA-derived samples is a minor product in infected plants. An antiserum against the ORF 2b product recognized a 58 kDa protein, which indicates that the fully processed replicase is entirely or almost entirely encoded by ORF 2b. The origin of the detected cleavage products and a proposed polyprotein processing model are discussed.
- Research Article
14
- 10.1371/journal.pone.0066211
- Jun 24, 2013
- PLoS ONE
Overlapping open reading frames (ORFs) in viral genomes undergo co-evolution; however, how individual amino acids coded by overlapping ORFs are structurally, functionally, and co-evolutionarily constrained remains difficult to address by conventional homologous sequence alignment approaches. We report here a new experimental and computational evolution-based methodology to address this question and report its preliminary application to elucidating a mode of co-evolution of the frame-shifted overlapping ORFs in the adeno-associated virus (AAV) serotype 2 viral genome. These ORFs encode both capsid VP protein and non-structural assembly-activating protein (AAP). To show proof of principle of the new method, we focused on the evolutionarily conserved QVKEVTQ and KSKRSRR motifs, a pair of overlapping heptapeptides in VP and AAP, respectively. In the new method, we first identified a large number of capsid-forming VP3 mutants and functionally competent AAP mutants of these motifs from mutant libraries by experimental directed evolution under no co-evolutionary constraints. We used Illumina sequencing to obtain a large dataset and then statistically assessed the viability of VP and AAP heptapeptide mutants. The obtained heptapeptide information was then integrated into an evolutionary algorithm, with which VP and AAP were co-evolved from random or native nucleotide sequences in silico. As a result, we demonstrate that these two heptapeptide motifs could exhibit high degeneracy if coded by separate nucleotide sequences, and elucidate how overlap-evoked co-evolutionary constraints play a role in making the VP and AAP heptapeptide sequences into the present shape. 
Specifically, we demonstrate that two valine (V) residues and β-strand propensity in QVKEVTQ are structurally important, the strongly negative and hydrophilic nature of KSKRSRR is functionally important, and overlap-evoked co-evolution imposes strong constraints on serine (S) residues in KSKRSRR, despite high degeneracy of the motifs in the absence of co-evolutionary constraints.