Abstract
Aiming at the problems of parameter optimization and insufficient utilization of split reads in the detection for copy number variation (CNV), a new definition of relative read depth (RRD) and a randomized sampling strategy (RGN) are proposed in this paper. Compared to the raw read depth, the RRD parameter has weak correlation with GC content, mappability and the width of analysis windows tiled along the genome. The RGN strategy is based on the weighted sampling strategy which can speed up the read count data analysis. Subsequently, we propose an improved detection algorithm for CNV based on hidden Markov model (CNV-HMM). The HMM detects the abnormal signal of read count data and outputs the detection results of candidate CNVs. At the end of the algorithm, we filter out the results of candidate CNVs using the split reads to improve the performance of CNV-HMM algorithm. Finally, the experiment results show that our CNV-HMM algorithm has higher sensitivity and accuracy for CNVs detection than most of current detection algorithms and applicative both for diploid animal and plant.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.