Abstract

The human cytomegalovirus (HCMV) is a ubiquitous, human pathogenic herpesvirus. The complete viral genome is transcriptionally active during infection; however, a large part of its transcriptome has yet to be annotated. In this work, we applied the amplified isoform sequencing technique from Pacific Biosciences to characterize the lytic transcriptome of HCMV strain Towne varS. We developed a pipeline for transcript annotation using long-read sequencing data. We identified 248 transcriptional start sites, 116 transcriptional termination sites and 80 splicing events. Using this information, we have annotated 291 previously undescribed or only partially annotated transcript isoforms, including eight novel antisense transcripts and their isoforms, as well as a novel transcript (RS2) in the short repeat region, partially antisense to RS1. As in other organisms, we found high transcriptional diversity in HCMV, with many transcripts differing only slightly from one another. Comparing our transcriptome profiling results to an earlier ribosome footprint analysis, we concluded that the majority of the transcripts contain multiple translationally active ORFs, and that most isoforms contain unique combinations of ORFs. Based on these results, we propose that one important function of this transcriptional diversity may be to provide a regulatory mechanism at the level of translation.

Highlights

  • Although splicing in herpesviruses is relatively rare[19], over 100 splice junctions have been described in HCMV[5,7,12], many of which are alternatively spliced

  • Although the American Type Culture Collection (ATCC) stock is reported to contain an intact variant, no sequencing reads mapped to the UL133-145 region, nor could this region be amplified by a specific primer sequence

  • We did not detect the intact varL variant, which is reported to be present in the ATCC human cytomegalovirus (HCMV) strain Towne virus stock


Summary

Introduction

Although splicing in herpesviruses is relatively rare[19], over 100 splice junctions have been described in HCMV[5,7,12], many of which are alternatively spliced. Ribosome footprint analysis has identified hundreds of translationally active short ORFs in the HCMV genome[7]. This demonstrates that many transcripts express uORFs alongside the main ORFs, and may imply that even transcripts previously thought to be non-coding could have coding potential as polycistronic peptide coding RNAs (ppcRNAs). Short-read sequencing analysis has demonstrated that the complete HCMV genome is transcriptionally active during lytic infection[5]. It has revealed numerous splice sites and confirmed many of those previously detected. Among the major shortcomings of long-read sequencing methods are their low throughput and relatively high error rate[31]. While the latter does not typically pose a challenge in transcriptomic analyses, the low coverage of larger genomes makes the analysis more prone to picking up erroneous signals. Our focus was to identify novel transcripts, transcript isoforms and splice junctions, and to determine the coding potential of these transcripts by comparing them to available ribosome profiling data.
