PurposeBispecific antibodies (BsAbs), capable of targeting two antigens simultaneously, represent a significant advancement by employing dual mechanisms of action for tumor suppression. However, how to pair targets to develop effective and safe bispecific drugs is a major challenge for pharmaceutical companies.MethodsUsing machine learning models, we refined the biological characteristics of currently approved or in clinical development BsAbs and analyzed hundreds of membrane proteins as bispecific targets to predict the likelihood of successful drug development for various target combinations. Moreover, to enhance the interpretability of prediction results in bispecific target combination, we combined machine learning models with Large Language Models (LLMs). Through a Retrieval-Augmented Generation (RAG) approach, we supplement each pair of bispecific targets’ machine learning prediction with important features and rationales, generating interpretable analytical reports.ResultsIn this study, the XGBoost model with pairwise learning was employed to predict the druggability of BsAbs. By analyzing extensive data on BsAbs and designing features from perspectives such as target activity, safety, cell type specificity, pathway mechanism, and gene embedding representation, our model is able to predict target combinations of BsAbs with high market potential. Specifically, we integrated XGBoost with the GPT model to discuss the efficacy of each bispecific target pair, thereby aiding the decision-making for drug developers.ConclusionThe novelty of this study lies in the integration of machine learning and GPT techniques to provide a novel framework for the design of BsAbs drugs. This holistic approach not only improves prediction accuracy, but also enhances the interpretability and innovativeness of drug design.
Read full abstract