NSF4SL: negative-sample-free contrastive learning for ranking synthetic lethal partner genes in human cancers.

Shike Wang,Yong Liu,Yimiao Feng,Jie Zheng,Xin Liu,Min Wu

doi:10.1093/bioinformatics/btac462

Abstract

Detecting synthetic lethality (SL) is a promising strategy for identifying anti-cancer drug targets. Targeting SL partners of a primary gene mutated in cancer is selectively lethal to cancer cells. Due to high cost of wet-lab experiments and availability of gold standard SL data, supervised machine learning for SL prediction has been popular. However, most of the methods are based on binary classification and thus limited by the lack of reliable negative data. Contrastive learning can train models without any negative sample and is thus promising for finding novel SLs. We propose NSF4SL, a negative-sample-free SL prediction model based on a contrastive learning framework. It captures the characteristics of positive SL samples by using two branches of neural networks that interact with each other to learn SL-related gene representations. Moreover, a feature-wise data augmentation strategy is used to mitigate the sparsity of SL data. NSF4SL significantly outperforms all baselines which require negative samples, even in challenging experimental settings. To the best of our knowledge, this is the first time that SL prediction is formulated as a gene ranking problem, which is more practical than the current formulation as binary classification. NSF4SL is the first contrastive learning method for SL prediction and its success points to a new direction of machine-learning methods for identifying novel SLs. Our source code is available at https://github.com/JieZheng-ShanghaiTech/NSF4SL. Supplementary data are available at Bioinformatics online.

Full Text