Abstract

Background

Viral integration into a host genome is defined by two chimeric junctions that join viral and host DNA. Recently, computational tools have been developed that utilize NGS data to detect chimeric junctions. These methods identify individual viral-host junctions but do not associate chimeric pairs as an integration event. Without knowing the chimeric boundaries of an integration, its genetic content cannot be determined.

Results

SummonChimera is a Perl program that associates chimera pairs to infer the complete viral genomic integration event to the nucleotide level within single or paired-end NGS data. SummonChimera integration prediction was verified on a set of single-end IonTorrent reads from a purified Salmonella bacterium with an integrated bacteriophage. Furthermore, SummonChimera predicted integrations from experimentally verified Hepatitis B Virus chimeras within a paired-end Whole Genome Sequencing hepatocellular carcinoma tumor database.

Conclusions

SummonChimera identified all experimentally verified chimeras detected by current computational methods. Further, SummonChimera integration inference precisely predicted bacteriophage integration. The application of SummonChimera to cancer NGS accurately identifies deletion of host and viral sequence during integration. The precise nucleotide determination of an integration allows prediction of viral and cellular gene transcription patterns.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-014-0348-4) contains supplementary material, which is available to authorized users.

Highlights

  • Viral integration into a host genome is defined by two chimeric junctions that join viral and host DNA

  • The remaining 37 reads are individual chimeras and are likely artifacts of the sequencing process based on lack of chimeric junction coverage

  • Each ambiguous region had single nucleotide variations which likely resulted from sequencing error neighboring the chimeric junction

Summary

Introduction

Computational tools have been developed that utilize NGS data to detect chimeric junctions. These methods identify individual viral-host junctions but do not associate chimeric pairs as an integration event. There is a growing amount of Next Generation Sequencing (NGS) data available for cancer genomes [4,5,6,7]. These datasets allow for large-scale viral integration analysis in which single virus and host chimeras have been identified and compared [5,6,7,8,9,10]. The complete mapping of viral integrations requires the association of two chimeric sequences representing the two virus-host junctions present in each integration event. Mapping and association of both virus-host junctions allows the identification of viral and host sequences retained and lost during integration.
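
The idea of associating two junctions into one integration event can be illustrated with a minimal Python sketch. This is not SummonChimera's implementation (which is a Perl program); the `Junction` and `Integration` records, the `pair_junctions` function, and the `max_host_gap` threshold are all hypothetical names chosen for illustration. The sketch assumes each junction has already been reduced to a host chromosome, a host breakpoint, a viral breakpoint, and a side, and it ignores strand orientation, circular viral genomes, and host duplications at the integration site.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Junction:
    """One virus-host chimeric junction (hypothetical record layout)."""
    chrom: str      # host chromosome name
    host_pos: int   # host base adjacent to the viral sequence (1-based)
    virus_pos: int  # viral base adjacent to the host sequence (1-based)
    side: str       # "left": host -> virus, "right": virus -> host


@dataclass(frozen=True)
class Integration:
    """Two junctions associated as a single integration event."""
    chrom: str
    host_left: int
    host_right: int
    virus_start: int
    virus_end: int

    @property
    def host_bases_deleted(self) -> int:
        # Host bases strictly between the two host breakpoints.
        return max(0, self.host_right - self.host_left - 1)

    @property
    def virus_bases_retained(self) -> int:
        # Viral bases spanned by the two viral breakpoints (linear genome assumed).
        return abs(self.virus_end - self.virus_start) + 1


def pair_junctions(junctions: List[Junction], max_host_gap: int = 10_000) -> List[Integration]:
    """Greedily pair each left junction with the nearest unused right junction
    downstream on the same chromosome. max_host_gap is an assumed cutoff."""
    lefts = sorted((j for j in junctions if j.side == "left"),
                   key=lambda j: (j.chrom, j.host_pos))
    rights = [j for j in junctions if j.side == "right"]
    used = set()
    events = []
    for left in lefts:
        best = None  # (gap, index into rights)
        for idx, right in enumerate(rights):
            if idx in used or right.chrom != left.chrom:
                continue
            gap = right.host_pos - left.host_pos
            if 1 <= gap <= max_host_gap and (best is None or gap < best[0]):
                best = (gap, idx)
        if best is not None:
            used.add(best[1])
            right = rights[best[1]]
            events.append(Integration(left.chrom, left.host_pos, right.host_pos,
                                      left.virus_pos, right.virus_pos))
    return events
```

Under these assumptions, a junction that finds no partner stays unpaired, which mirrors the distinction the text draws between individual chimeras and complete integration events. Once a pair is formed, the host and viral breakpoints together give the amount of host sequence lost and viral sequence retained.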

