Abstract

Background: For decades, double knockout (KO) perturbation screens were limited to model organisms such as S. pombe and S. cerevisiae. CRISPR technology has revolutionized genetic interaction discovery by allowing large-scale screening in human cell lines, organoids, and mouse models. However, much uncertainty remains regarding the optimal way to determine the presence of genetic interaction from the raw data generated by these large-scale double perturbation experiments. Here we compare two analysis methods run on the same normalized dataset to determine to what degree the analysis method influences the determination of genetic interaction.

Methods: A publicly available genetic interaction dataset generated from a pair-wise CRISPR-Cas9 KO screen (Shen et al, Nature Methods, 2017) was used for analysis. It contains 24,908 double knock-out constructs across three cell lines (HeLa, A549, 293T) at four time points (days 3, 14, 21, 28) with two replicates. These data were used to measure single gene fitness scores for 73 known cancer driver genes and all 2628 pair-wise interactions using (1) the numerical Bayesian method from Shen et al, called CTG (Compositional and Time-course-aware Genetic analysis), and (2) the variational Bayesian method GEMINI (Zamanighomi et al, Genome Biology, 2019).

Results: Single gene KO fitness measurements from CTG and GEMINI were highly correlated for all three cell lines (Pearson r = 0.678, 0.604, and 0.784 for HeLa, A549, and 293T, respectively; p < 0.1 × 10⁻⁸ for each). In contrast, genetic interaction scores from the two methods were essentially uncorrelated: HeLa r = -0.0143, p = 0.47; A549 r = -0.0476, p = 0.015; 293T r = -0.0135, p = 0.49. Of 52 synthetic lethal interactions identified by CTG in HeLa at a z-score cutoff of -3, none were identified by GEMINI at the same cutoff. Conversely, of 4 interactions identified by GEMINI, none were identified by CTG.
Similarly, in A549, none of the 57 interactions identified by CTG were identified by GEMINI, and none of the 3 interactions identified by GEMINI were identified by CTG. Restricting to genetic interactions that were tested in low-throughput drug-drug assays, of 5 synthetic lethal interactions found in HeLa by CTG (CHEK1-MAP2K1, CHEK1-TYMS, ADA-CHEK1, ATM-CHEK1, CDK9-CHEK1), all but CHEK1-TYMS were validated. However, none of the 5 were scored as hits by GEMINI. Of 3 interactions scored as synthetic lethal in A549 by CTG (PRKDC-RRM2, CDK9-PRKDC, CDK4-PRKDC), all but PRKDC-RRM2 were validated, and none of the 3 were scored as hits by GEMINI.

Conclusions: This study highlights dramatic differences in the genetic interaction scores calculated by two different computational algorithms applied to the same experimental data. With only 8 of 2628 (0.3%) interactions tested in validation experiments, it is not currently possible to know the ground truth and thereby assess which method is more accurate. The generation of synthetic genetic interaction data will be an important step for further optimization of algorithms to detect genetic interaction.

Citation Format: John Paul Shen, Yue Gu, Saikat Chowdhury. Determining genetic interaction from double knockout CRISPR screening [abstract]. In: Proceedings of the AACR Special Conference in Cancer Research: Expanding and Translating Cancer Synthetic Vulnerabilities; 2024 Jun 10-13; Montreal, Quebec, Canada. Philadelphia (PA): AACR; Mol Cancer Ther 2024;23(6 Suppl):Abstract nr A023.
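
The kind of method-to-method comparison reported in the Results can be illustrated with a minimal Python sketch. It is not the CTG or GEMINI implementation: the gene names (`G1` to `G4`), the toy z-scores, and the helper names `pearson_r` and `synthetic_lethal_hits` are hypothetical, and treating a hit as "z-score at or below -3" is an assumption about how the cutoff is applied. The sketch only shows how the pair count, the correlation between two score sets, and the overlap of hits at a shared cutoff might be computed.

```python
from itertools import combinations
from math import comb, sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def synthetic_lethal_hits(z_scores, cutoff=-3.0):
    """Return the gene pairs whose z-score is at or below the cutoff.

    Assumption: a "hit" means z <= cutoff; the abstract only states the
    cutoff value (-3), not whether the comparison is strict.
    """
    return {pair for pair, z in z_scores.items() if z <= cutoff}


# 73 genes give C(73, 2) unordered pairs, matching the 2628 in the abstract.
n_pairs = comb(73, 2)

# Hypothetical toy z-scores for four gene pairs (NOT values from the study).
toy_pairs = list(combinations(["G1", "G2", "G3", "G4"], 2))[:4]
method_a = dict(zip(toy_pairs, [-3.5, -0.2, 0.4, -3.1]))
method_b = dict(zip(toy_pairs, [-0.1, -3.2, 0.3, -0.5]))

# Correlate the two methods over the pairs scored by both.
shared = sorted(method_a.keys() & method_b.keys())
r = pearson_r([method_a[p] for p in shared], [method_b[p] for p in shared])

# Call hits per method at the same cutoff, then intersect.
hits_a = synthetic_lethal_hits(method_a)
hits_b = synthetic_lethal_hits(method_b)
overlap = hits_a & hits_b

print(n_pairs, round(r, 3), len(hits_a), len(hits_b), len(overlap))
```

An empty `overlap` alongside non-empty `hits_a` and `hits_b` corresponds to the situation described for HeLa and A549, where neither method recovered any of the other's hits at the same cutoff.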