Biological relevance of CNV calling methods using familial relatedness including monozygotic twins.

Christina A Castellani, Zain Awamleh, Andrea E Wishart, M Elizabeth O Locke, Richard L O’Reilly, Melkaye G Melka, Shiva M Singh

doi:10.1186/1471-2105-15-114

Open Access

Journal: BMC Bioinformatics	Publication Date: Apr 21, 2014
Citations: 20	License type: cc-by

Affiliation: Western University

Abstract

Background: Studies involving the analysis of structural variation including Copy Number Variation (CNV) have recently exploded in the literature. Furthermore, CNVs have been associated with a number of complex diseases and neurodevelopmental disorders. Common methods for CNV detection use SNP, CNV, or CGH arrays, where the signal intensities of consecutive probes are used to define the number of copies associated with a given genomic region. These practices pose a number of challenges that interfere with the ability of available methods to accurately call CNVs. It has, therefore, become necessary to develop experimental protocols to test the reliability of CNV calling methods from microarray data so that researchers can properly discriminate biologically relevant data from noise.

Results: We have developed a workflow for the integration of data from multiple CNV calling algorithms using the same array results. It uses four CNV calling programs: PennCNV (PC), Affymetrix® Genotyping Console™ (AGC), Partek® Genomics Suite™ (PGS) and Golden Helix SVS™ (GH) to analyze CEL files from the Affymetrix® Human SNP 6.0 Array™. To assess the relative suitability of each program, we used individuals of known genetic relationships. We found significant differences in CNV calls obtained by different CNV calling programs.

Conclusions: Although the programs showed variable patterns of CNVs in the same individuals, their distribution in individuals of different degrees of genetic relatedness has allowed us to offer two suggestions. The first involves the use of multiple algorithms for the detection of the largest possible number of CNVs, and the second suggests the use of PennCNV over all other methods when the use of only one software program is desirable.
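The idea of integrating calls from several programs can be illustrated with a minimal Python sketch. It is not the authors' workflow: the `Call` record, the toy coordinates, and the 50% reciprocal-overlap threshold (`min_overlap`) are all assumptions chosen for illustration. For a given CNV call, the sketch reports which programs made a matching call in the same individual.

```python
from typing import NamedTuple


class Call(NamedTuple):
    """One CNV call (hypothetical record layout, not a real file format)."""
    chrom: str
    start: int  # start position in bp
    end: int    # end position in bp
    kind: str   # "gain" or "loss"


def reciprocal_overlap(a: Call, b: Call) -> float:
    """Return the smaller of the two fractions of each call covered by the shared interval."""
    if a.chrom != b.chrom or a.kind != b.kind:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return min(shared / (a.end - a.start), shared / (b.end - b.start))


def supporting_programs(call, calls_by_program, min_overlap=0.5):
    """Return the names of programs with at least one call matching `call`.

    `min_overlap` is an assumed threshold, not a value taken from the paper.
    """
    return {
        program
        for program, calls in calls_by_program.items()
        if any(reciprocal_overlap(call, other) >= min_overlap for other in calls)
    }


# Hypothetical toy calls for a single individual, keyed by program abbreviation.
calls_by_program = {
    "PC": [Call("chr1", 1000, 2000, "loss"), Call("chr2", 500, 900, "gain")],
    "AGC": [Call("chr1", 1100, 2100, "loss")],
    "PGS": [Call("chr1", 1900, 3000, "loss")],
    "GH": [],
}

query = calls_by_program["PC"][0]
support = supporting_programs(query, calls_by_program)
print(sorted(support))
```

In this toy input the PGS call touches the query region but covers too little of it to count as a match under the assumed threshold, which shows how the choice of overlap criterion affects how many programs appear to agree.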

Highlights

Studies involving the analysis of structural variation including Copy Number Variation (CNV) have recently exploded in the literature
We argue that a combination of three programs (Affymetrix® Genotyping Console™, Partek, and PennCNV) may be optimal to identify biologically relevant CNV calls due to their ability to resolve copy number variations across different biological relatedness
The results show that the number of raw CNVs identified in each individual varies depending on the program used

Summary

Introduction

Studies involving the analysis of structural variation including Copy Number Variation (CNV) have recently exploded in the literature. Common methods for CNV detection use SNP, CNV, or CGH arrays, where the signal intensities of consecutive probes are used to define the number of copies associated with a given genomic region. These practices pose a number of challenges that interfere with the ability of available methods to accurately call CNVs. It has therefore become necessary to develop experimental protocols to test the reliability of CNV calling methods from microarray data so that researchers can properly discriminate biologically relevant data from noise. One previous attempt to confirm putative CNV calls using qPCR produced results that differed with the number of algorithms making the call: 38.3% of CNVs called by a single algorithm, 57.6% of CNVs called by two algorithms and 71.4% of CNVs called by three algorithms could be confirmed by qPCR [16].
