GPT4Kinase: High-accuracy prediction of inhibitor-kinase binding affinity utilizing large language model

Kaifeng Liu, Xiangyu Yu, Huizi Cui, Wannan Li, Weiwei Han

doi:10.1016/j.ijbiomac.2024.137069

Abstract

Accurate prediction of inhibitor-kinase binding affinity is crucial in biological research and medical applications. Kinases play a pivotal role in numerous cellular processes and are essential enzymes in the Mitogen-Activated Protein Kinase (MAPK) signaling pathway. This study harnesses Large Language Models (LLMs), specifically GPT-4, to predict the binding affinity between inhibitors and kinases within the MAPK pathway, including Raf protein kinase (RAF), Mitogen-activated protein kinase kinase (MEK) and Extracellular Signal-Regulated Kinase (ERK). GPT-4 achieved 87.31 % accuracy in predicting RAF binding affinity and 77.00 % accuracy in comprehensive prediction tasks, substantially outperforming existing mainstream methods such as AutoDock Vina (21.21 %), BatchDTA (52.00 %) and KIPP (59.60 %). Furthermore, GPT-4 was employed to delineate the features of high-affinity and low-affinity molecules, as well as their contributing functional groups. These contributing groups were subsequently validated through molecular docking. To assess the generalizability of the method, we applied it to six other kinases and achieved a maximum accuracy of 83.78 %. We also evaluated it on a dataset comprising over 200 kinases, obtaining an accuracy of 66.20 %. The study showcases the transformative impact of LLMs on molecular binding affinity prediction, with major implications for biological sciences and therapeutic development.

Full Text