Introduction
A common source of tumorigenesis in several cancer types, such as breast and ovarian cancer, is a defect in the homologous recombination (HR) machinery. An HR defect can be detected by identifying the single nucleotide variants (SNVs), deletions, duplications, and other errors generated by the alternative double-strand break repair machineries. However, current diagnostic tools for HR defects rely either on copy number changes, which offer reduced sensitivity and specificity, or on the existence of BRCA mutations. Recent efforts to detect HR defects using mutational signatures, meanwhile, require a large number of SNVs, more than is typically obtained from panel sequencing, which is widely used in clinics.

Material and methods
We developed a computational tool called SigMA to identify signatures of HR defect even from low SNV counts. SigMA carries out a multivariate analysis to isolate the effect of a single biological process in the presence of multiple mutagenic processes. The multivariate analysis includes likelihood estimations, which we suggest as novel and sensitive measures, together with commonly used measures such as cosine similarity and signature amplitudes. First, we determine the signature composition of SNVs in 720 whole-genome sequenced (WGS) breast cancer samples with non-negative matrix factorization (NMF). Then, to simulate the sequencing technique of interest, we select from each sample the subset of mutations that fall within its coverage. We optimise our algorithm on the simulated data, treating the NMF results as the baseline measurement, and validate its performance on panel data from 890 breast tumours.

Results and discussions
Whereas previous attempts to detect mutational signatures from panel data were limited to cases with a high mutation burden, SigMA can be applied to a large fraction of samples with panel data.
On a panel consisting of 410 genes, we substantially improved the detection of the mutational signature of HR defect, reaching a sensitivity of 0.75 at a false positive rate of 0.1, and a sensitivity of 0.5 at a false positive rate of 0.01. SigMA detected an HR defect in 213 of the 890 breast cancer panel samples. For exome and whole-genome data, the performance of the tool improves further.

Conclusion
We developed a tool that extends the applicability of mutational signature analysis to low mutation counts. Applying our SigMA algorithm to detect HR defects and other mutational signatures in panel sequencing data will increase the number of cases that may benefit from therapeutic agents that target specific classes of genomic alterations.
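The likelihood and cosine-similarity measures named in the methods can be illustrated with a minimal sketch. This is not SigMA's implementation: the function names, the four-channel toy spectra, and the signature labels `flat_like` and `peaked` are all hypothetical, and real SNV spectra are usually binned into 96 trinucleotide contexts. The sketch assumes each signature is a probability distribution over mutation channels and scores a sample's counts under a multinomial model with equal prior weight per signature.

```python
import math


def cosine_similarity(counts, signature):
    """Cosine of the angle between a count spectrum and a signature."""
    dot = sum(c * s for c, s in zip(counts, signature))
    norm_c = math.sqrt(sum(c * c for c in counts))
    norm_s = math.sqrt(sum(s * s for s in signature))
    if norm_c == 0 or norm_s == 0:
        return 0.0
    return dot / (norm_c * norm_s)


def log_likelihood(counts, signature, eps=1e-12):
    """Multinomial log-likelihood, dropping the term that is constant
    across signatures (the multinomial coefficient)."""
    return sum(c * math.log(max(s, eps)) for c, s in zip(counts, signature))


def relative_likelihoods(counts, signatures):
    """Normalise the per-signature likelihoods so that they sum to 1
    (assumption: equal prior weight for every signature)."""
    logs = {name: log_likelihood(counts, sig) for name, sig in signatures.items()}
    top = max(logs.values())  # subtract the maximum for numerical stability
    weights = {name: math.exp(value - top) for name, value in logs.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Hypothetical toy signatures over four mutation channels.
signatures = {
    "flat_like": [0.25, 0.25, 0.25, 0.25],
    "peaked": [0.70, 0.10, 0.10, 0.10],
}

# A low-count sample: ten SNVs spread across the four channels.
counts = [3, 2, 3, 2]

rel = relative_likelihoods(counts, signatures)
cos = {name: cosine_similarity(counts, sig) for name, sig in signatures.items()}
print(rel)
print(cos)
```

With only ten mutations, both measures favour the flat toy signature here. In a multivariate analysis of the kind the abstract describes, such scores would be combined with signature amplitudes rather than used on their own.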