CNV Workshop: an integrated platform for high-throughput copy number variation discovery and clinical diagnostics

Xiaowu Gai, Juan C Perin, Kevin Murphy, Hongbo M Xie, Peter S White, Adam Wenocur, Ryan O'Hara, Tamim H Shaikh, Eric F Rappaport, Monica D'Arcy

doi:10.1186/1471-2105-11-74

Abstract

Background: Recent studies have shown that copy number variations (CNVs) are frequent in higher eukaryotes and associated with a substantial portion of inherited and acquired risk for various human diseases. The increasing availability of high-resolution genome surveillance platforms provides opportunity for rapidly assessing research and clinical samples for CNV content, as well as for determining the potential pathogenicity of identified variants. However, few informatics tools for accurate and efficient CNV detection and assessment currently exist.

Results: We developed a suite of software tools and resources (CNV Workshop) for automated, genome-wide CNV detection from a variety of SNP array platforms. CNV Workshop includes three major components: detection, annotation, and presentation of structural variants from genome array data. CNV detection utilizes a robust and genotype-specific extension of the Circular Binary Segmentation algorithm, and the use of additional detection algorithms is supported. Predicted CNVs are captured in a MySQL database that supports cohort-based projects and incorporates a secure user authentication layer and user/admin roles. To assist with determination of pathogenicity, detected CNVs are also annotated automatically for gene content, known disease loci, and gene-based literature references. Results are easily queried, sorted, filtered, and visualized via a web-based presentation layer that includes a GBrowse-based graphical representation of CNV content and relevant public data, integration with the UCSC Genome Browser, and tabular displays of genomic attributes for each CNV.

Conclusions: To our knowledge, CNV Workshop represents the first cohesive and convenient platform for detection, annotation, and assessment of the biological and clinical significance of structural variants. CNV Workshop has been successfully utilized for assessment of genomic variation in healthy individuals and disease cohorts and is an ideal platform for coordinating multiple associated projects.

Availability and Implementation: Available on the web at: http://sourceforge.net/projects/cnv

Highlights

Recent studies have shown that copy number variations (CNVs) are frequent in higher eukaryotes and associated with a substantial portion of inherited and acquired risk for various human diseases
We describe here our implementation of, and modifications to, Circular Binary Segmentation (CBS), first for the Illumina Infinium array platform and then the further modifications required for use with Affymetrix, other SNP, and array-based comparative genomic hybridization (aCGH) arrays
We have previously reported CNV Workshop threshold values for calling germline heterozygous deletions, homozygous deletions, and duplications from Illumina 550 K data that we found effective for a wide range of samples and genotype quality scores [12,13]
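
For readers unfamiliar with Circular Binary Segmentation, the core idea can be sketched as follows. This is a minimal illustration of the generic CBS test statistic only, not the genotype-specific extension used in CNV Workshop. The function names, the toy intensity values, and the permutation count are all assumptions made for the example. The calling thresholds mentioned above are not reproduced here because this section does not list them.

```python
import math
import random


def best_circular_split(values):
    """Find the arc values[i:j] whose mean differs most from the rest.

    Returns (score, i, j). The score is the absolute difference of the two
    means scaled by sqrt(1/len_in + 1/len_out), i.e. an unnormalised
    two-sample statistic maximised over all arcs.
    """
    n = len(values)
    prefix = [0.0]
    for v in values:
        prefix.append(prefix[-1] + v)
    total = prefix[n]
    best = (0.0, 0, n)
    for i in range(n):
        for j in range(i + 1, n + 1):
            inside = j - i
            outside = n - inside
            if outside == 0:
                continue  # the arc must leave some probes outside it
            mean_in = (prefix[j] - prefix[i]) / inside
            mean_out = (total - prefix[j] + prefix[i]) / outside
            score = abs(mean_in - mean_out) / math.sqrt(1.0 / inside + 1.0 / outside)
            if score > best[0]:
                best = (score, i, j)
    return best


def permutation_p_value(values, n_perm=100, seed=0):
    """Estimate how often shuffled data gives a split at least as strong."""
    rng = random.Random(seed)
    observed = best_circular_split(values)[0]
    shuffled = list(values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if best_circular_split(shuffled)[0] >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical, noise-free log-ratio values: 10 probes with a reduced signal.
toy = [0.0] * 20 + [-0.5] * 10 + [0.0] * 20
score, start, end = best_circular_split(toy)
p_value = permutation_p_value(toy)
```

In a full segmentation, a split that passes the significance test would be accepted and the procedure repeated recursively on each resulting segment. This sketch uses a quadratic search for clarity, so it is suited only to short probe windows.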

Summary

Introduction

Recent studies have shown that copy number variations (CNVs) are frequent in higher eukaryotes and associated with a substantial portion of inherited and acquired risk for various human diseases. CNVs are widely distributed in the genomes of apparently healthy individuals and constitute significant amounts of population-based genomic variation [3,4,5,6,7,8]. New genotyping technologies such as SNP-based arrays provide high-resolution coverage of entire genomes as well as an opportunity for rapidly determining CNV content in sample collections of interest [4,6,7,9,10,11]. Interpretation of the exact extent, character, distribution, and effect of these CNVs has been limited by the emerging nature of computational methods for accurate detection, and further challenged by the difficulty in assessing the biological importance of particular CNVs in context with other genomic features and study findings.

Objectives

Results

Discussion

Conclusion