Abstract
Identification of viral integration sites has been important in understanding the pathogenesis and progression of diseases associated with particular viral infections. The advent of next-generation sequencing (NGS) has enabled researchers to understand the effects that viral integration has on the host, such as tumorigenesis. Current computational methods for analyzing NGS data of virus-host junction sites have been limited in their accessibility to a broad user base. In this study, we developed a software application, ChimericSeq, that is the first program of its kind to offer a graphical user interface (GUI) and compatibility with both Windows and Mac operating systems, and that is optimized for identifying and annotating virus-host chimeric reads within NGS data. In addition, ChimericSeq’s pipeline implements custom filtering to remove artifacts and detect reads, with quantitative analytical reporting to provide functional significance to discovered integration sites. The improved accessibility of ChimericSeq through a GUI on both Windows and Mac has the potential to expand NGS analytical support to a broader spectrum of the scientific community.
Highlights
Many viruses, such as the hepatitis B virus (HBV), integrate into host genomes, causing genomic disruption and instability [1,2].
In addition to easy installation, ChimericSeq enables error checking of processing parameters, provides analytical annotation features, and is able to deal with memory limitations via file-splitting.
The reduction of computational dependencies and the internal use of Bowtie2’s implementation of algorithms allow for more efficient performance. These improvements were demonstrated in a comparison between ChimericSeq and other current programs for identifying chimeric reads, using a synthetic data set we created, next-generation sequencing (NGS) data generated from three HCC tissue DNA samples, and a whole genome sequencing (WGS) dataset provided by VirusFinder2 containing an HPV integration event [9].
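The file-splitting idea mentioned in the highlights can be illustrated with a minimal sketch. It is not ChimericSeq's actual implementation, which is not described here. It assumes plain four-line FASTQ records, and the function name `split_fastq` and the parameter `reads_per_chunk` are hypothetical. The point is only that reading a fixed number of records at a time keeps memory use bounded regardless of total file size.

```python
import io
from itertools import islice
from typing import Iterator, List, TextIO


def split_fastq(handle: TextIO, reads_per_chunk: int) -> Iterator[List[str]]:
    """Yield lists of FASTQ lines holding at most `reads_per_chunk` records.

    Hypothetical helper: assumes every record spans exactly four lines
    (header, sequence, '+', qualities), with no line-wrapped sequences.
    """
    if reads_per_chunk < 1:
        raise ValueError("reads_per_chunk must be >= 1")
    lines_per_chunk = 4 * reads_per_chunk
    while True:
        # islice pulls only the next block of lines, so the whole file
        # is never held in memory at once.
        chunk = list(islice(handle, lines_per_chunk))
        if not chunk:
            break
        if len(chunk) % 4 != 0:
            raise ValueError("truncated FASTQ record at end of input")
        yield chunk


# Small in-memory example with three made-up reads.
demo = io.StringIO(
    "@r1\nACGT\n+\nIIII\n"
    "@r2\nTTGA\n+\nIIII\n"
    "@r3\nGGCC\n+\nIIII\n"
)
chunk_sizes = [len(chunk) // 4 for chunk in split_fastq(demo, reads_per_chunk=2)]
```

In a real workflow each chunk would be written to its own temporary file and aligned separately, with the per-chunk results merged afterwards. The chunk size would be chosen from the memory available on the user's machine.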
Summary
Many viruses, such as the hepatitis B virus (HBV), integrate into host genomes, causing genomic disruption and instability [1,2]. These integration events may facilitate progression of consequential disease states, such as cancer, suggesting that the identification and characterization of virus-host integration sites can provide important insights into tumorigenesis [3,4]. With an increasing amount of next-generation sequencing (NGS) data being generated, efficient and sensitive tools that are accessible to a broader user base are needed to facilitate NGS data analysis [5,6]. Ongoing investigation of virus-host integration sites has been limited by the scope of currently available tools for NGS analysis of chimeric reads.