Abstract

Background
Genetic alterations implicated in constitutional disorders range in size from single-nucleotide substitutions to losses and gains of millions of bases. Whole exome sequencing (WES) captures many clinically relevant variants, but an analytic pipeline optimized for detection of small substitutions, insertions, and deletions may miss pathogenic copy number variants (CNVs). Historically, our laboratory performed separate cytogenetic tests to evaluate for larger genomic imbalances. Adding copy number analysis to our WES pipeline could improve the assay's diagnostic yield. This study describes part of our strategy to validate copy number analysis of our constitutional WES data.

Methods
We identified 18 specimens with CNVs that underwent both chromosomal microarray analysis (CMA) and WES between 2017 and 2023. Variants included 13 deletions (size range 14 kb to 2 Mb) and 5 duplications (size range 325 kb to 10 Mb). The standard method of constitutional copy number analysis in our clinical laboratory is CMA using the Affymetrix CytoScan HD Array and analysis software. CNVs identified on CMA were reviewed with reference to the Database of Genomic Variants and multiple academic hospital databases. WES was performed on an Illumina sequencing system following capture-based library enrichment with the Integrated DNA Technologies xGen Exome Research, CNV Backbone, Human mtDNA Research, and Human ID Research Panel probes. CNVs were called from WES data using the Broad Institute's Genome Analysis Toolkit (GATK) with a quality score cutoff of 120. WES data were visualized in the Integrative Genomics Viewer.

Results
Of the 18 CNVs reported on CMA, 14 (78%) were also detected on WES with a quality score at or above 120. This includes 10 out of 13 deletions (77%) and 3 out of 5 duplications (60%). For both deletions and duplications, the size range of variants detected at or above the cutoff score overlapped with the size range of undetected variants. One duplication reported on CMA was detected on WES with a quality score below 120. Visualization of WES data showed variation in probe density that may have affected CNV calling.

Conclusions
The majority of CNVs reported after CMA (14/18, 78%) were also detected in WES data using a GATK quality score cutoff of 120. The data do not suggest an association between CNV size and detection by WES, although interpretation is limited by the small number of cases. Data visualization may be required to identify some CNVs. Future steps to characterize the limitations of copy number analysis of WES data include assessments of variance in probe density and read depth across the exome. Copy number analysis of WES data from cases with normal CMA results will also be performed for assay validation.
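
The concordance comparison described in the Methods and Results can be illustrated with a minimal Python sketch. It is not the laboratory's pipeline: the `Cnv` record, the simple interval-overlap rule, and the toy coordinates and quality scores are all assumptions made for illustration. Only the cutoff of 120 comes from the abstract. A CMA-reported CNV is counted as detected when a WES call of the same type overlaps it and meets the cutoff.

```python
from dataclasses import dataclass

QS_CUTOFF = 120  # quality score cutoff stated in the abstract


@dataclass(frozen=True)
class Cnv:
    """Hypothetical record for a CNV reported by CMA or called from WES."""
    chrom: str
    start: int
    end: int
    kind: str             # "DEL" or "DUP"
    quality: float = 0.0  # only meaningful for WES calls


def overlaps(a: Cnv, b: Cnv) -> bool:
    """True when the two intervals share at least one base (assumed rule)."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def is_detected(cma: Cnv, wes_calls: list[Cnv], cutoff: float = QS_CUTOFF) -> bool:
    """A CMA CNV counts as detected if a same-type WES call overlaps it
    with a quality score at or above the cutoff."""
    return any(
        call.kind == cma.kind and call.quality >= cutoff and overlaps(cma, call)
        for call in wes_calls
    )


def detection_rate(cma_cnvs: list[Cnv], wes_calls: list[Cnv],
                   cutoff: float = QS_CUTOFF) -> tuple[int, int]:
    """Return (number detected, number reported on CMA)."""
    detected = sum(is_detected(c, wes_calls, cutoff) for c in cma_cnvs)
    return detected, len(cma_cnvs)


# Toy data, invented for illustration only (not the study's specimens).
cma = [
    Cnv("chr1", 1_000_000, 1_500_000, "DEL"),
    Cnv("chr7", 5_000_000, 5_400_000, "DUP"),
    Cnv("chr15", 20_000_000, 22_000_000, "DEL"),
]
wes = [
    Cnv("chr1", 1_050_000, 1_480_000, "DEL", quality=310.0),
    Cnv("chr7", 5_010_000, 5_390_000, "DUP", quality=95.0),  # below cutoff
]

detected, total = detection_rate(cma, wes)
print(f"{detected}/{total} CMA-reported CNVs detected on WES ({detected / total:.0%})")
```

In this toy set, the duplication with an overlapping call below the cutoff is counted as undetected, which mirrors how a sub-threshold call is treated in the Results. A stricter matching rule, such as a minimum reciprocal overlap, could be substituted in `overlaps` without changing the rest of the sketch.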