Abstract
Introduction: Genome-wide CRISPR-Cas9 loss-of-function screens can be used to identify genes essential for the proliferation and survival of cancer cells. Recent studies have focused on establishing reference sets of essential and non-essential genes, correcting copy number effects, and characterizing off-target effects, but in-depth studies of the effects of gene expression abundance and of sgRNAs that target multiple genomic loci are lacking. To fill this gap, we present a bioinformatics workflow to reduce false positives in CRISPR-Cas9 screens.

Description: The gastric adenocarcinoma cell line AGS was infected with a CRISPR knockout library (TKOv3) at a multiplicity of infection of 0.3 to 0.4. We used the cells collected right after puromycin selection as the baseline sample, and the cells cultured for 14 days or 20 days as the negative selection samples. The sgRNA inserts were amplified by PCR, and the corresponding libraries were sequenced on a NextSeq 500 with a single-end 75 bp run, followed by analysis with MAGeCK. The sgRNA read counts were normalized to non-essential genes to reduce false positives. The RNA-seq data and copy number data were obtained from the CCLE portal. To characterize sgRNAs targeting multiple genomic loci, Bowtie was used to align sgRNAs to the human reference genome (GRCh38) with no mismatches allowed, and only alignments followed by an NGG PAM site were retained for downstream analysis.

Summary: Integration of RNA-seq data with the CRISPR negative selection screen results showed that the selection signal was noisy for lowly expressed genes. The fraction of selected essential genes (overall FDR < 0.05, absolute value of beta score > 1) was as low as 0.11% among genes in the bottom 10% of expression, compared with 27% among genes in the top 10% of expression. After filtering out the lowly expressed genes (<0.06 RPKM), the selected essential genes had FDRs much closer to 0.
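As an illustration only, the expression filter and selection criteria described above could be sketched as follows. The thresholds (0.06 RPKM, FDR < 0.05, absolute beta score > 1) come from the abstract; the `GeneResult` record, the function names, and the toy values are hypothetical and do not reflect the authors' actual code or data.

```python
from dataclasses import dataclass


@dataclass
class GeneResult:
    """Hypothetical per-gene record: expression plus screen statistics."""
    gene: str
    rpkm: float   # RNA-seq expression level
    beta: float   # beta score from the screen analysis
    fdr: float    # false discovery rate for the gene


def filter_low_expression(results, min_rpkm=0.06):
    """Drop genes expressed below min_rpkm (the abstract filters <0.06 RPKM)."""
    return [g for g in results if g.rpkm >= min_rpkm]


def is_selected(g, fdr_cutoff=0.05, beta_cutoff=1.0):
    """Selection criteria quoted in the abstract: FDR < 0.05 and |beta| > 1."""
    return g.fdr < fdr_cutoff and abs(g.beta) > beta_cutoff


def selected_fraction(results):
    """Fraction of genes in `results` that pass the selection criteria."""
    if not results:
        return 0.0
    return sum(is_selected(g) for g in results) / len(results)


# Toy values invented for illustration; not data from the study.
toy = [
    GeneResult("GENE_A", rpkm=35.0, beta=-1.8, fdr=0.001),
    GeneResult("GENE_B", rpkm=0.01, beta=-1.2, fdr=0.03),
    GeneResult("GENE_C", rpkm=12.0, beta=-0.2, fdr=0.60),
    GeneResult("GENE_D", rpkm=0.00, beta=0.1, fdr=0.90),
]
kept = filter_low_expression(toy)
kept_names = [g.gene for g in kept]
```

In this toy input, GENE_B passes the selection criteria despite being essentially unexpressed, which is the kind of call the filter is meant to remove. The sketch applies the threshold to a results table for simplicity; the abstract's conclusion describes filtering before the screen data are analyzed.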
Of the 40 essential genes selected without filtering out lowly expressed genes, none has been reported as an oncogene in the literature. To study the influence of multiple alignments of sgRNAs, we considered only sgRNAs with perfect alignments (i.e., no mismatches), to avoid confounding with off-target effects caused by mismatch tolerance. Log fold changes in read counts were calculated for each sgRNA between a later time point (day 14 or day 20) and baseline (day 0). The median log fold change decreased significantly as a function of the number of perfect alignments (p = 0.0001, Jonckheere trend test). This supports the hypothesis that an sgRNA aligned to several DNA targets will introduce multiple double-stranded cuts and will thus result in biased essentiality scores.

Conclusions: Filtering out lowly expressed genes prior to CRISPR screen data analysis can reduce false positives. In addition, multiple-target sgRNAs can lead to false positives, but the effect needs further analysis on a case-by-case basis.

Citation Format: Yue Zhao, Xue Wu, Yuru Wang, Kin Fai Au, Lijun Cheng, Lang Li. New bioinformatics workflow of genome-wide CRISPR-Cas9 knockout screens [abstract]. In: Proceedings of the Annual Meeting of the American Association for Cancer Research 2020; 2020 Apr 27-28 and Jun 22-24. Philadelphia (PA): AACR; Cancer Res 2020;80(16 Suppl):Abstract nr 830.
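The multi-alignment analysis summarized in the abstract can be sketched roughly as below: keep only alignments followed by an NGG PAM, compute a per-sgRNA log fold change against baseline, and take the median per number of perfect alignments. This is a minimal sketch under stated assumptions: the pseudocount of 1, the log base 2, the function names, and the toy counts are all hypothetical, and the Jonckheere trend test itself is not implemented here.

```python
import math
from collections import defaultdict
from statistics import median


def has_ngg_pam(downstream3):
    """True if the 3 bases immediately 3' of the protospacer form an NGG PAM.

    Assumes the sequence is given on the same strand as the protospacer.
    """
    s = downstream3.upper()
    return len(s) == 3 and s[0] in "ACGT" and s[1:] == "GG"


def log2_fold_change(later, baseline, pseudocount=1.0):
    """log2 ratio of later vs. baseline read counts.

    The pseudocount is an assumption added here to avoid division by zero.
    """
    return math.log2((later + pseudocount) / (baseline + pseudocount))


def median_lfc_by_hits(records):
    """Median log fold change grouped by number of perfect alignments.

    Each record is (n_perfect_alignments, baseline_count, later_count).
    """
    groups = defaultdict(list)
    for n_hits, baseline, later in records:
        groups[n_hits].append(log2_fold_change(later, baseline))
    return {n: median(values) for n, values in sorted(groups.items())}


# Toy read counts invented for illustration; not data from the study.
records = [
    (1, 99, 99),  # single-site sgRNA, unchanged
    (1, 99, 49),  # single-site sgRNA, mild depletion
    (2, 99, 24),  # two-site sgRNA, stronger depletion
    (2, 99, 24),
]
medians = median_lfc_by_hits(records)
```

A trend test across the ordered alignment-count groups (the abstract uses the Jonckheere trend test) would then be applied to the per-sgRNA values rather than to these medians alone.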