Abstract

To facilitate reference-material selection for clinical genetic testing laboratories, we developed VarCover, open-source software hosted on GitHub, which accepts a file of variants and returns an approximately minimum set (min-set) of samples covering the targeted alleles. VarCover employs the SetCoverPy package, sample weights, and preselection of singleton-possessing samples to efficiently solve the min-set cover problem. As a test case, we attempted to find a min-set of reference samples from the 1000 Genomes Project to cover 237 variants considered putatively pathogenic (of which 12 were classified as pathogenic or likely pathogenic) in the original 56 medically actionable genes recommended by the American College of Medical Genetics and Genomics (ACMG). The number of samples, number of alleles, and processing time were measured in subsets of the 237 target alleles. VarCover identified 140 reference-material samples from the 1000 Genomes Project covering the 237 alleles in the 56 ACMG-recommended genes. Sample weights derived from the minor allele frequency spectrum increased the number of alleles in the solution set. Preselection of samples that possessed singleton target alleles reduced computational processing time when the target set size exceeded 100 alleles. VarCover provides a simple programmatic interface for identifying an approximately minimum set of reference samples, thereby reducing clinical laboratory effort and molecular genetic test-validation costs.
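The min-set cover idea with singleton preselection can be illustrated with a minimal sketch. This is not VarCover's implementation and does not use the SetCoverPy API; it substitutes a plain greedy heuristic for the solver. The function names, the `costs` argument (a hypothetical per-sample cost, not the allele-frequency-derived weights described above), and the toy sample and allele labels are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Set


def preselect_singletons(sample_alleles: Dict[str, Set[str]]) -> List[str]:
    """Return samples that are the only carrier of at least one allele.

    Such samples must appear in any valid cover, so they can be fixed up
    front and their alleles removed from the search.
    """
    carriers: Dict[str, List[str]] = {}
    for sample, alleles in sample_alleles.items():
        for allele in alleles:
            carriers.setdefault(allele, []).append(sample)
    return sorted({names[0] for names in carriers.values() if len(names) == 1})


def greedy_min_set(
    sample_alleles: Dict[str, Set[str]],
    targets: Set[str],
    costs: Optional[Dict[str, float]] = None,
) -> List[str]:
    """Greedy approximation to a min-set of samples covering `targets`.

    `costs` is a hypothetical positive per-sample cost (default 1.0); a
    higher cost makes a sample less likely to be chosen. Target alleles
    carried by no sample are skipped.
    """
    costs = costs or {}
    # Restrict every sample to the target alleles it carries.
    pool = {s: alleles & targets for s, alleles in sample_alleles.items()}
    coverable: Set[str] = set().union(*pool.values())

    # Step 1: fix samples that uniquely carry some target allele.
    chosen = preselect_singletons(pool)
    covered: Set[str] = set()
    for sample in chosen:
        covered |= pool[sample]

    # Step 2: repeatedly add the sample with the most new alleles per unit cost.
    while covered != coverable:
        best = max(
            sorted(pool),  # sorted for deterministic tie-breaking
            key=lambda s: len(pool[s] - covered) / costs.get(s, 1.0),
        )
        chosen.append(best)
        covered |= pool[best]
    return chosen


# Toy data (hypothetical labels, not real 1000 Genomes samples or variants).
samples = {
    "S1": {"a", "b"},
    "S2": {"b", "c"},
    "S3": {"c", "d"},
    "S4": {"d"},
}
print(greedy_min_set(samples, {"a", "b", "c", "d"}))  # ['S1', 'S3']
```

In this toy case, allele `a` is carried only by `S1`, so `S1` is preselected and the greedy step has to cover only `c` and `d`. Shrinking the problem before the solver runs is the intuition behind preselection, which the abstract reports as reducing processing time for larger target sets.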
