Abstract
Mutational processes can cause driver mutations, which are a proximal cause of tumorigenesis. COSMIC catalogs signatures of the biological and environmental events that drive mutational processes; these signatures can be identified by examining patterns of somatic mutations. We describe the complexity of the COSMIC mutational signatures and the impact of that complexity on signature detection. We introduce Shannon entropy as a metric for the complexity of trinucleotide distributions, thus capturing a previously ignored dimension of mutational signatures that may predict how readily they can be identified. Shannon entropy derives from the rich predictive mathematics of information theory. We calculated entropy for all the COSMIC mutational signatures and found that SBS3, the homologous recombination repair deficiency (HRD) signature, had the highest entropy, 6.326, while SBS2, the signature of AID/APOBEC cytidine deaminase activation, had the lowest, 1.758. We discovered that high-entropy signatures (such as the HRD signature SBS3) require more somatic mutations to be reliably detected than low-entropy signatures do, which is important when planning signature detection with targeted panels that have a smaller genomic footprint. We demonstrate this correlation with two approaches: i) simulation using MutSigSim, a tool we developed for simulating mutational profiles given the number of mutations, and ii) analysis of real datasets. Simulation was performed for high-, medium-, and low-entropy signatures. We used a cosine similarity of 0.7 as the threshold for detection. A signature with entropy <2 can be detected 90% of the time with 10 or more mutations. Similarly, signatures with entropies of 2-3, 3-5, and >5 can be detected 90% of the time with 30, 40, and 100 mutations, respectively. Mutational signatures of 1500 samples from a pan-solid-tumor cohort sequenced with Oncomine Comprehensive Assay Plus or Oncomine TMB were used to confirm these conclusions.
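The entropy metric and the simulation-based detection test described above can be illustrated with a minimal sketch. This is not MutSigSim, whose implementation is not described in this abstract. It assumes that entropy is computed with a base-2 logarithm over the 96 trinucleotide channels (consistent with the reported maximum of 6.326, which lies below log2(96) ≈ 6.585), that a simulated profile is a multinomial draw from the signature, and that a signature counts as detected when the cosine similarity between the simulated profile and the signature is at least 0.7. The two signatures below (`flat` and `peaked`) are hypothetical toy distributions, not COSMIC values, and all function names are illustrative.

```python
import math
import random
from collections import Counter

N_CHANNELS = 96  # trinucleotide-context substitution classes in an SBS profile


def shannon_entropy(probs):
    """Shannon entropy (base 2) of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def simulate_profile(signature, n_mutations, rng):
    """Draw n_mutations channels from the signature; return per-channel counts."""
    draws = rng.choices(range(len(signature)), weights=signature, k=n_mutations)
    counts = Counter(draws)
    return [counts.get(i, 0) for i in range(len(signature))]


def detection_rate(signature, n_mutations, threshold=0.7, trials=500, seed=0):
    """Fraction of simulated profiles whose cosine similarity to the
    generating signature reaches the threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        profile = simulate_profile(signature, n_mutations, rng)
        if cosine_similarity(profile, signature) >= threshold:
            hits += 1
    return hits / trials


# Hypothetical toy signatures (assumptions for illustration, not COSMIC data).
flat = [1 / N_CHANNELS] * N_CHANNELS      # uniform: maximal entropy, log2(96)
peaked = [0.0] * N_CHANNELS               # mass concentrated on four channels
for channel, weight in zip(range(4), (0.4, 0.3, 0.2, 0.1)):
    peaked[channel] = weight              # entropy of about 1.85

if __name__ == "__main__":
    for name, sig in (("flat", flat), ("peaked", peaked)):
        print(name, "entropy:", round(shannon_entropy(sig), 3))
        for n in (10, 30, 100, 300):
            print("  n =", n, "detection rate:", detection_rate(sig, n))
```

In this toy setting the uniform signature can never reach the 0.7 threshold with 10 mutations, because the cosine similarity of any 10-mutation profile to a uniform 96-channel vector is at most sqrt(10/96) ≈ 0.32, whereas the peaked signature usually does. This mirrors, qualitatively only, the relationship between entropy and the required mutation count reported in the abstract.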
We saw that signatures with higher entropy were difficult to detect at a cosine similarity threshold of 0.7 in samples with fewer mutations. In a cohort of MSI-high samples, we detected mismatch repair (MMR) signatures of low to medium entropy with high specificity and good sensitivity. Furthermore, in a cohort of ovarian cancer samples, the high-entropy HRD signature was difficult to detect at the 0.7 threshold unless the number of somatic mutations was high. We show that the ability to detect mutational signatures using targeted panels depends directly on the complexity of the signature and the number of mutations.
For research use only. Not for use in diagnostic procedures.
Citation Format: Ajithavalli Chellappan, Chintan Vora, Jagannath Patro, Shilpa Nair, Rushikesh S. Kanap, Fiona C. Hyland. Shannon entropy of mutational signatures predict sensitivity of signature detection in targeted sequencing [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2022; 2022 Apr 8-13. Philadelphia (PA): AACR; Cancer Res 2022;82(12_Suppl):Abstract nr 1515.