Abstract

Motivation

CNValidator assesses the quality of somatic copy-number calls based on the coherency of haplotypes across multiple samples from the same individual. It is applicable to any copy-number calling algorithm that makes calls independently for each sample. This test is useful in assessing the accuracy of copy-number calls, as well as in choosing among alternative copy-number algorithms or tuning parameter values.

Results

On a dataset of somatic samples from individuals with Barrett’s Esophagus, CNValidator provided feedback on the correctness of sample ploidy calls and also detected data quality issues.

Availability and implementation

CNValidator is available on GitHub at https://github.com/kuhnerlab/CNValidator.

Supplementary information

Supplementary data are available at Bioinformatics online.

Highlights

  • Studies of somatic variation within organisms, in neoplasia and cancers, often require inference of somatic changes in copy number. Such inference can be based on SNP array data using programs such as ASCAT (Van Loo et al., 2010) or ABSOLUTE (Carter et al., 2012), or on sequencing data using programs such as ascatNgs (Raine et al., 2016)

  • We present the haplotype coherency test, which leverages information from multiple samples from the same patient to estimate the accuracy of inferred allele-specific copy-number calls

  • Our Barrett’s Esophagus (BE) results show the usefulness of CNValidator both in choosing among alternative copy-number approaches and in detecting failure of copy-number calling, in our case due to a quality-control issue

Summary

Introduction

Studies of somatic variation within organisms, in neoplasia and cancers, often require inference of somatic changes in copy number. We present the haplotype coherency test, which leverages information from multiple samples from the same patient to estimate the accuracy of inferred allele-specific copy-number calls. CNValidator requires B-allele frequencies from array or sequencing data (though we have practical experience only with array data), together with segments and copy-number calls from a copy-number algorithm. It uses simple input formats that can be prepared from a variety of file formats.

In one sample, examination of individual calls showed many segments with inferred fractional copy number that would round to a balanced call with a low-ploidy baseline, but to an unbalanced call with a high-ploidy baseline; the coherency test strongly favored the unbalanced calls and the assignment of a high-ploidy baseline for this sample. In another case, quality-control checks showed that a patient had been run using the wrong normal control; when the analysis was repeated with the correct control, accuracies were over 98%.
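The intuition behind a coherency check of this kind can be illustrated with a small sketch. It is not CNValidator's implementation, and the details are assumptions made for illustration: the function name `coherency`, the 0.5 midpoint, and the folding step are all hypothetical. The assumed idea is that if a segment is truly unbalanced in two samples from the same patient, the heterozygous SNPs in that segment should fall on the same side of the B-allele-frequency midpoint in both samples (or on exactly opposite sides, if the other haplotype is favored), whereas a segment wrongly called unbalanced should agree only about half the time.

```python
from typing import Sequence


def coherency(baf_a: Sequence[float], baf_b: Sequence[float],
              midpoint: float = 0.5) -> float:
    """Hypothetical coherency score for one segment in two samples.

    baf_a and baf_b hold B-allele frequencies for the same heterozygous
    SNPs, in the same order. The score is the fraction of SNPs on the
    same side of the midpoint in both samples, folded so that a fully
    opposite pattern also scores 1.0 (assumption: either haplotype may
    be the over-represented one in each sample).
    """
    if len(baf_a) != len(baf_b):
        raise ValueError("both samples must cover the same SNPs")

    # SNPs sitting exactly on the midpoint carry no phase information.
    pairs = [(a, b) for a, b in zip(baf_a, baf_b)
             if a != midpoint and b != midpoint]
    if not pairs:
        raise ValueError("no informative SNPs in this segment")

    same_side = sum((a > midpoint) == (b > midpoint) for a, b in pairs)
    fraction = same_side / len(pairs)
    return max(fraction, 1.0 - fraction)


# Made-up BAF values for six SNPs in one segment.
sample_a = [0.66, 0.34, 0.67, 0.33, 0.65, 0.36]
sample_b = [0.70, 0.30, 0.72, 0.29, 0.69, 0.31]   # same haplotype favored
flipped_b = [1.0 - x for x in sample_b]           # other haplotype favored
noisy_b = [0.52, 0.51, 0.48, 0.49, 0.53, 0.47]    # noise around 0.5

print(coherency(sample_a, sample_b))
print(coherency(sample_a, flipped_b))
print(coherency(sample_a, noisy_b))
```

Under these assumptions, scores near 1.0 support an unbalanced call in both samples, while scores near 0.5 suggest that at least one of the calls may be wrong. The values above are invented for illustration and do not come from the Barrett's Esophagus dataset.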
