Abstract

Sequence Independent Single Primer Amplification (SISPA) is one of the most widely used random amplification approaches in virology for sequencing template preparation. This technique relies on oligonucleotides consisting of a 3′ random part used to prime complementary DNA synthesis and a 5′ defined tag sequence for subsequent amplification. Recently, this amplification method was combined with next-generation sequencing to obtain viral sequences. However, these studies showed a biased distribution of the resulting sequence reads over the analyzed genomes. The aim of this study was to elucidate the mechanisms that lead to biased sequence depth when using random amplification. Avian paramyxovirus type 8 was used as a model RNA virus to investigate these mechanisms. We showed, based on in silico analysis of the sequence depth in relation to GC-content, predicted RNA secondary structure and sequence complementarity to the 3′ part of the tag sequence, that the tag sequence is the main contributor to the observed bias in sequence depth. We confirmed this finding experimentally using both fragmented and non-fragmented viral RNAs as well as primers differing in random oligomer length (6 or 12 nucleotides) and in the sequence of the amplification tag. The observed oligonucleotide annealing bias can be reduced by extending the random oligomer sequence and by combining, in silico, sequence data from SISPA experiments that use different 5′ defined tag sequences. These findings contribute to the optimization of random nucleic acid amplification protocols that are currently required for downstream applications such as viral metagenomics and microarray analysis.

Highlights

  • The determination of complete viral genome sequences is a growing field in human, animal, and plant virology

  • Using the 6N Sequence Independent Single Primer Amplification (SISPA) primer FR20RV-6N [16], the complete coding sequence of Avian paramyxovirus type 8 (APMV-8) could be determined using a reference assembly with approximately 7.7 Mb of raw data. These 7.7 Mb of raw reads were randomly picked from the complete dataset and correspond to about 500× sequence depth under the assumption of even sequence depth along the genome (Table 1)

  • Despite the median sequencing depth of 326.5×, extreme variation in sequence depth (1× to 3,286×) was observed (Table 1; Figure 1A, repeat 1). 23% of the genome nucleotides were covered fewer than 100 times, which we set as the minimum sequence depth to allow quantitative variant analysis (Table 1)
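
The depth figures in the highlights above can be illustrated with a short sketch. It shows how an expected depth follows from the raw data volume under the even-depth assumption, and how the share of positions below a depth threshold might be computed. The genome length of roughly 15,300 nt is an assumption that is not stated in the text above, the function names are hypothetical, and the toy depth profile is made-up illustrative data, not the study's results.

```python
def expected_depth(total_bases, genome_length):
    """Mean depth if all sequenced bases were spread evenly over the genome."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return total_bases / genome_length


def fraction_below(depths, threshold):
    """Fraction of genome positions whose depth is below the threshold."""
    if not depths:
        raise ValueError("depths must not be empty")
    return sum(1 for d in depths if d < threshold) / len(depths)


# ASSUMPTION: approximate APMV-8 genome length; not given in the text above.
GENOME_LENGTH_NT = 15_300
RAW_BASES = 7.7e6  # "approximately 7.7 Mb of raw data"

# Roughly 500x, in line with the figure quoted in the highlights.
print(round(expected_depth(RAW_BASES, GENOME_LENGTH_NT)))

# Toy per-position depth profile (illustrative values only).
toy_depths = [1, 50, 99, 100, 326, 3286, 400, 250]
print(fraction_below(toy_depths, threshold=100))
```

In practice the per-position depths would come from the reference assembly rather than a hand-written list; the 100× threshold mirrors the minimum depth the highlights name for quantitative variant analysis.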

Summary

Introduction

The determination of complete viral genome sequences is a growing field in human, animal, and plant virology. Efficient sequencing approaches rely heavily on prior sequence knowledge and are often focused on specific groups of viruses to allow for robust design of amplification primers (e.g. [2]). Examples include streamlined sequencing protocols for influenza A viruses [5,6], classical swine fever virus [7] and foot-and-mouth disease virus [8]. These protocols allow completion of the viral genome(s) in a single experiment and provide sufficient sequencing depth to analyze the variability of RNA virus populations in a single sample (e.g. [9,10]).
