Abstract

The implication of viruses in human cancers, as well as the emergence of next generation sequencing has permitted to investigate further their role and pathophysiology in the development of this disease. One such mechanism is the integration of portions of viral genomes in the human genome, as well as the specific action of viral oncogenes.inding integration sites and preserved oncogenes is still relying on heavy manual intervention. We developed an analysis and interpretation pipeline to determine viral insertions. Using data from directed viral capture, the pipeline conducts a crude genotyping phase to select reference viral genomes, identifies chimeric reads, extracts the putative human sequences to locate in the human reference genome, scores and ranks candidate junctions, and exports tabular and visual results. We leverage common bioinformatics tools (bowtie2, samtools, blat), and a dedicated filtering and ranking algorithm, implemented in R, to infer candidate junctions and insertions. Static results (tables, figures) are produced, as well as an interactive interpretation tool developed as a shiny web app. We validated this pipeline against published results of HPV, HBV, and AAV2 insertions and show good information retrieval.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call