Parametric modeling of whole-genome sequencing data for CNV identification

S Vardhanabhuti, H Li, X J Jeng, Y Wu

doi:10.1093/biostatistics/kxt060

Abstract

Copy number variants (CNVs) constitute an important class of genetic variants in the human genome and have been shown to be associated with complex diseases. Whole-genome sequencing provides an unbiased way of identifying all the CNVs that an individual carries. In this paper, we consider parametric modeling of the read depth (RD) data from whole-genome sequencing with the aim of identifying CNVs, including both Poisson and negative-binomial modeling of such count data. We propose a unified approach that uses a mean-matching variance-stabilizing transformation to turn the relatively complicated problem of sparse segment identification for count data into a sparse segment identification problem for a sequence of Gaussian data. We then apply the optimal sparse segment identification procedure to the transformed data to identify the CNV segments. This provides a computationally efficient approach for RD-based CNV identification. Simulation results show that this approach often yields few false CNV identifications and performs similarly to or better than other RD-based approaches in identifying the true CNVs. We demonstrate the methods using the trio data from the 1000 Genomes Project.
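The transformation step described in the abstract can be illustrated with a minimal sketch for the Poisson case. The abstract does not give the formula, so this assumes the commonly used Poisson mean-matching root transform, 2*sqrt(Q + 1/4) applied to binned counts Q. The function names `bin_counts` and `poisson_mean_matching_vst` and the bin size `m` are hypothetical and not taken from the paper. The negative-binomial case and the segment identification step are not shown.

```python
import math
from typing import List, Sequence


def bin_counts(counts: Sequence[int], m: int) -> List[int]:
    """Sum per-position read-depth counts over consecutive bins of m positions.

    An incomplete trailing bin is dropped (an assumption of this sketch).
    """
    if m <= 0:
        raise ValueError("bin size m must be positive")
    n_bins = len(counts) // m
    return [sum(counts[i * m:(i + 1) * m]) for i in range(n_bins)]


def poisson_mean_matching_vst(binned: Sequence[int]) -> List[float]:
    """Apply the assumed root transform 2 * sqrt(Q + 1/4) to each binned count.

    For Poisson counts with a reasonably large mean, the transformed values
    are approximately Gaussian with variance close to 1.
    """
    return [2.0 * math.sqrt(q + 0.25) for q in binned]


# Toy read-depth counts (made up for illustration, not real data).
toy_counts = [1, 1, 0, 2, 3, 1, 0, 0]
toy_binned = bin_counts(toy_counts, m=4)
toy_transformed = poisson_mean_matching_vst(toy_binned)
```

After this step, a shift in copy number would show up as a shift in the mean of an approximately Gaussian sequence, which is the setting in which a Gaussian segment identification procedure can be applied.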

Journal: Biostatistics
Publication Date: Jan 28, 2014
Citations: 6

Similar Papers

Short Read (Next-Generation) Sequencing
Jaya Punetha ... Eric P Hoffman
Circulation: Cardiovascular Genetics | VOL. 6
14 Jul 2013

The variable somatic genome
Maeve O’Huallachain ... Michael P Snyder
Cell Cycle | VOL. 12
19 Dec 2012

Genome-wide Transcriptome Profiling Reveals the Functional Impact of Rare De Novo and Recurrent CNVs in Autism Spectrum Disorders
Rui Luo ... Daniel H Geschwind
The American Journal of Human Genetics | VOL. 91
21 Jun 2012

Copy number variation in human genomes from three major ethno-linguistic groups in Africa
Oscar A Nyangiri ... Enock Matovu
BMC Genomics | VOL. 21
10 Apr 2020
