Abstract

Introduction: Two genes are synthetic lethal (SL) if a cell loses viability when both genes lose function but remains viable when only one of them does. With the emergence of CRISPR-Cas9 technology, many cancer cell lines have been screened in gene combination double knockout (CDKO) experiments. These CDKO experiments have been effective in identifying SL gene pairs as potential combination targets for cancer treatment. The existing SL database, SynLethDB v2.0, attempted to integrate SL experiments and data but has several gaps: (1) it omits all non-SL gene pairs; (2) its cell line annotations are inaccurate; (3) its coverage of CDKO experiments is incomplete; (4) its SL scores deviate from those originally reported; and, most importantly, (5) it ignores the differences among SL calculation methods, which span sequence mapping, sequence count processing, sample normalization, background control normalization, and SL score calculation.

Methods: We developed the SL knowledge base (SLKB), which includes SL data from 11 published CDKO experiments in 22 cell lines. We compared SLKB and SynLethDB on their cell line data and on the positive and negative SL gene pairs collected from CDKO experiments. We also compared five SL calculation methods: median SL score with and without background control normalization (Median-B/NB), sgRNA-derived SL score with and without background control normalization (sgRNA-Derived B/NB), Horlbeck-Score, GEMINI, and MAGeCK-based SL score (MAGeCK-Score). For the top 10% of SL gene pairs under each method, we measured both the overall overlap across methods and the pairwise overlap between methods.

Results:
- For human SL, SynLethDB included only two CDKO experiments, 189 SL gene pairs, and 0 non-SL gene pairs. SLKB has 11 CDKO experiments, 22 cell lines, 16,044 SL gene pairs, and 264,444 non-SL gene pairs.
- Among the top 10% of SL gene pairs from the five SL calculation methods, only 0.05% overlap across the methods.
- In the pairwise analysis, Median-B/NB has the highest average pairwise overlap with the other four methods (26.82%), and GEMINI has the lowest (5.11%). The average pairwise overlaps for MAGeCK-Score, sgRNA-Derived B/NB, and Horlbeck-Score are 24.52%, 18.50%, and 9.89%, respectively.

Conclusion: SLKB integrates CDKO experiment data far more comprehensively than SynLethDB and retains the originally reported scores. The five SL calculation methods differ substantially in the SL scores they produce. All five SL scores can be queried and ranked in SLKB. SLKB also documents the technical details of these five methods, allowing users to review them and choose appropriate methods for their data analysis and presentation.

Citation Format: Birkan Gökbağ, Shan Tang, Kunjie Fan, Lijun Cheng, Lang Li. SLKB: Synthetic lethality knowledge base for gene combination double knockout experiments [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2023; Part 1 (Regular and Invited Abstracts); 2023 Apr 14-19; Orlando, FL. Philadelphia (PA): AACR; Cancer Res 2023;83(7_Suppl):Abstract nr 6581.
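
Illustrative note: the top 10% overlap comparison described in the Methods can be sketched as follows. This is a minimal sketch under stated assumptions, not the SLKB implementation. The abstract does not define how overlap is normalized, so the denominators here (size of the smaller top set for pairwise overlap, size of the union for overall overlap) are assumptions, as is the convention that a higher score means stronger SL. All function names, method names, and the toy data are hypothetical.

```python
from itertools import combinations


def top_fraction(scores, fraction=0.10):
    """Return the gene pairs ranked in the top `fraction` by score.

    Assumption: a higher score means stronger SL. Some scoring methods
    use the opposite sign, in which case the scores should be negated first.
    """
    k = max(1, int(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


def pairwise_overlap(top_sets):
    """Overlap for every pair of methods, as a fraction of the smaller top set."""
    result = {}
    for a, b in combinations(sorted(top_sets), 2):
        shared = top_sets[a] & top_sets[b]
        result[(a, b)] = len(shared) / min(len(top_sets[a]), len(top_sets[b]))
    return result


def average_overlap(pairwise, method):
    """Average pairwise overlap of one method with all other methods."""
    values = [v for pair, v in pairwise.items() if method in pair]
    return sum(values) / len(values)


def overall_overlap(top_sets):
    """Gene pairs shared by all methods, as a fraction of the union of top sets."""
    sets = list(top_sets.values())
    return len(set.intersection(*sets)) / len(set.union(*sets))


def ranked_scores(order):
    """Toy helper: assign descending scores to gene pairs listed best-first."""
    return {pair: float(len(order) - i) for i, pair in enumerate(order)}


# Hypothetical toy data: three methods scoring the same ten gene pairs.
toy_scores = {
    "method_a": ranked_scores(["p0", "p1", "p2", "p3", "p4", "p5", "p6", "p7", "p8", "p9"]),
    "method_b": ranked_scores(["p0", "p2", "p1", "p3", "p4", "p5", "p6", "p7", "p8", "p9"]),
    "method_c": ranked_scores(["p1", "p0", "p2", "p3", "p4", "p5", "p6", "p7", "p8", "p9"]),
}

# Top 20% is used here only so that the toy top sets hold two pairs each.
top_sets = {name: top_fraction(s, fraction=0.20) for name, s in toy_scores.items()}
pairwise = pairwise_overlap(top_sets)
```

With real data, each entry of `toy_scores` would be replaced by one method's scores for the same set of gene pairs in one cell line, and `fraction` left at its default of 0.10.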