A Density Peak-Based Method to Detect Copy Number Variations From Next-Generation Sequencing Data.

Kun Xie, Ye Tian, Xiguo Yuan

doi:10.3389/fgene.2020.632311

Open Access

https://doi.org/10.3389/fgene.2020.632311

Journal: Frontiers in Genetics	Publication Date: Jan 13, 2021
Citations: 5	License type: CC BY 4.0

Affiliation: Xidian University

Abstract

Copy number variation (CNV) is a common type of structural variation in the human genome and is biologically relevant to complex human diseases. Detecting CNVs is an important step toward their systematic analysis in medical research on complex diseases. The recent development of next-generation sequencing (NGS) platforms provides unprecedented opportunities for detecting CNVs at base-level resolution. However, owing to the intrinsic characteristics of NGS data, accurate detection of CNVs remains challenging. In this article, we propose a new density peak-based method, called dpCNV, for detecting CNVs from NGS data. The dpCNV algorithm is built on the density peak clustering algorithm. It extracts two features, local density and minimum distance, from the sequencing read depth (RD) profile to generate a two-dimensional dataset. From this dataset, a two-dimensional null distribution is constructed to test the significance of each genome bin, and the significant bins are declared as CNVs. We test dpCNV on a number of simulated datasets and compare it with several existing methods. The experimental results show that the proposed method outperforms the others in terms of sensitivity and F1-score. We further apply it to a set of real sequencing samples, and the results demonstrate the validity of dpCNV. We therefore expect that dpCNV can serve as a supplement to existing methods and may become a routine tool in the field of genome mutation analysis.
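
To make the two features named in the abstract concrete, the following is a minimal sketch of how local density and minimum distance could be computed for a binned RD profile, using the standard definitions from density peak clustering (Rodriguez and Laio, 2014). It is not the dpCNV implementation: the function name `density_peak_features`, the `cutoff` parameter, the use of the absolute RD difference as the distance between bins, and the toy RD values are all assumptions made for illustration.

```python
import numpy as np


def density_peak_features(rd, cutoff):
    """Return (local_density, min_distance) for each bin of a 1-D RD profile.

    Hypothetical helper: the distance between two bins is assumed to be the
    absolute difference of their RD values, and `cutoff` is an assumed
    user-chosen neighbourhood radius.
    """
    rd = np.asarray(rd, dtype=float)
    # Pairwise absolute RD differences between bins (n x n matrix).
    dist = np.abs(rd[:, None] - rd[None, :])
    # Local density: number of OTHER bins whose RD lies within `cutoff`.
    rho = (dist < cutoff).sum(axis=1) - 1
    # Minimum distance to any bin with strictly higher local density.
    delta = np.empty(len(rd))
    for i in range(len(rd)):
        higher = rho > rho[i]
        if higher.any():
            delta[i] = dist[i, higher].min()
        else:
            # Density peak convention: bins with maximal density get
            # their largest distance to any other bin.
            delta[i] = dist[i].max()
    return rho, delta


# Toy RD profile: mostly ~100, one gain-like bin (150), one loss-like bin (40).
rd_profile = [100, 101, 99, 102, 100, 150, 98, 101, 40, 100]
rho, delta = density_peak_features(rd_profile, cutoff=5)
for value, r, d in zip(rd_profile, rho, delta):
    print(f"RD={value:>3}  local_density={r}  min_distance={d:.0f}")
```

In this toy profile the two aberrant bins have zero local density and sit far from every denser bin, which is the kind of separation a two-dimensional significance test could exploit. How dpCNV chooses the cutoff and builds its null distribution is not specified in the abstract and is not reproduced here.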

Highlights

Copy number variation (CNV) is an important category of DNA structural variation, comprising amplifications or losses of DNA fragments longer than 1 kilo base-pairs (Freeman et al, 2006; Yuan et al, 2012b).
dpCNV takes into account that CNV regions usually account for a small fraction of the whole genome and that many CNVs display only a “local” outlier state; it extracts two related features from the read depth (RD) profile based on the density peak algorithm (Rodriguez and Laio, 2014).
We apply the proposed method to a set of real sequencing samples obtained from the European Genome-phenome Archive (EGA) database.

Summary

Introduction

Copy number variation (CNV) is an important category of DNA structural variation, comprising amplifications or losses of DNA fragments longer than 1 kilo base-pairs (bp) (Freeman et al, 2006; Yuan et al, 2012b). CNV is one of the important pathogenic factors in complex human diseases (Shlien and Malkin, 2009; Fridley et al, 2012; Xi et al, 2020a,b). Analyzing CNVs is therefore necessary and meaningful when studying and treating complex diseases, especially human cancers. The mechanisms by which CNVs form fall into two categories: DNA recombination and DNA error replication (Martin et al, 2019). Under either mechanism, CNVs usually present as amplifications or deletions. The major step of CNV analysis in samples obtained from human
