Abstract

Background: In antibody engineering, an essential task is to design a novel antibody whose paratope binds the correct epitope of a specific antigen. Understanding an antibody's structure and its paratope facilitates a mechanistic understanding of its function. Antibody structure prediction from sequence alone has therefore long been a highly valuable problem for de novo antibody design. AlphaFold2 (AF2), a breakthrough in structural biology, addresses the protein structure prediction problem with a deep learning model. However, its computational cost and unsatisfactory prediction accuracy on antibodies, especially in the complementarity-determining regions (CDRs), limit its application in de novo antibody design.

Methods: To learn informative representations of antibodies, we trained a deep antibody language model (ALM), based on a transformer architecture, on curated sequences from the Observed Antibody Space database. We also developed a novel model named xTrimoABFold++ that predicts antibody structure from the antibody sequence alone, building on the pretrained ALM together with efficient Evoformer and structure modules. The model was trained end-to-end on the antibody structures in the PDB by minimizing an ensemble loss that combines a domain-specific focal loss on the CDRs with the frame aligned point loss.

Results: xTrimoABFold++ outperforms AF2, OmegaFold, and HelixFold-Single, with an improvement of more than 30% in RMSD. It is also 151 times faster than AF2 and predicts antibody structure with atomic accuracy within 20 seconds. On recently released antibodies, for example cemiplimab targeting PD-1 (PDB: 7WVM) and the cross-neutralizing antibody 6D6 against SARS-CoV-2 (PDB: 7EAN), the RMSDs of xTrimoABFold++ are 0.344 and 0.389, respectively.

Conclusion: To the best of our knowledge, xTrimoABFold++ achieves state-of-the-art performance in antibody structure prediction.
Its improvements in both accuracy and efficiency make it a valuable tool for de novo antibody design, and it could contribute to further advances in immuno-theory.

Experimental results on the immune antibody dataset, with 95% confidence intervals.

| Method | RMSD | TM-score | GDT_TS | GDT_HA |
|---|---|---|---|---|
| AlphaFold2 | 3.1254 ± 0.1410 | 0.8385 ± 0.0055 | 0.7948 ± 0.0057 | 0.6548 ± 0.0063 |
| OmegaFold | 3.2610 ± 0.1463 | 0.8384 ± 0.0057 | 0.7925 ± 0.0059 | 0.6586 ± 0.0063 |
| HelixFold-Single | 2.9648 ± 0.0997 | 0.8328 ± 0.0055 | 0.7805 ± 0.0057 | 0.6225 ± 0.0060 |
| ESMFold | 3.1549 ± 0.0067 | 0.8390 ± 0.0003 | 0.7952 ± 0.0003 | 0.6551 ± 0.0003 |
| xTrimoABFold++ | 1.9594 ± 0.0805 | 0.8986 ± 0.0052 | 0.8694 ± 0.0057 | 0.7456 ± 0.0066 |

Citation Format: Yining Wang, Xumeng Gong, Shaochuan Li, Bing Yang, Yiwu Sun, Yujie Luo, Hui Li, Le Song. Fast de novo antibody structure prediction with atomic accuracy [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2023; Part 1 (Regular and Invited Abstracts); 2023 Apr 14-19; Orlando, FL. Philadelphia (PA): AACR; Cancer Res 2023;83(7_Suppl):Abstract nr 4296.
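As a reading aid for the RMSD figures reported above, the following minimal sketch shows how an RMSD between two coordinate sets can be computed and how the relative improvement over each baseline follows from the reported mean values. It is an illustration under stated assumptions, not the authors' evaluation code: the abstract does not say which atoms are compared or how structures are superimposed, so `rmsd` here assumes already-aligned coordinates, and the helper names `rmsd` and `relative_improvement` are hypothetical.

```python
import math


def rmsd(pred, ref):
    """Root-mean-square deviation between two equally long lists of (x, y, z) points.

    Assumption: the two structures are already superimposed; no alignment
    is performed here, and the choice of atoms is left to the caller.
    """
    if len(pred) != len(ref) or not pred:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(p, r) ** 2 for p, r in zip(pred, ref))
    return math.sqrt(squared / len(pred))


def relative_improvement(baseline, new):
    """Fractional reduction of a lower-is-better metric relative to a baseline."""
    return (baseline - new) / baseline


# Mean RMSD values copied from the results reported in the abstract.
baseline_rmsd = {
    "AlphaFold2": 3.1254,
    "OmegaFold": 3.2610,
    "HelixFold-Single": 2.9648,
    "ESMFold": 3.1549,
}
xtrimoabfold_rmsd = 1.9594

improvements = {
    method: relative_improvement(value, xtrimoabfold_rmsd)
    for method, value in baseline_rmsd.items()
}
for method, gain in improvements.items():
    print(f"{method}: {gain:.1%} lower RMSD")

# Toy coordinates (not real data): every point is displaced by one unit along z.
toy_pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
toy_ref = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
toy_value = rmsd(toy_pred, toy_ref)
```

With the reported means, the smallest reduction is relative to HelixFold-Single, at roughly 34%, which is consistent with the stated improvement of more than 30% in RMSD.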