Antibody thermostability is challenging to predict from sequence and/or structure, likely because neither provides direct entropic information. Herein, we present AbMelt, in which we model the inherent flexibility of homologous antibody structures using molecular dynamics simulations at three temperatures and learn the relevant descriptors to predict the temperatures of aggregation (Tagg), melt onset (Tm,on), and melt (Tm). We observed that the radius of gyration deviation of the complementarity-determining regions at 400 K is the descriptor most strongly Pearson-correlated with aggregation temperature (rp = −0.68 ± 0.23), and that the deviation of internal molecular contacts at 350 K is the descriptor most strongly correlated with both Tm,on (rp = −0.74 ± 0.04) and Tm (rp = −0.69 ± 0.03). Moreover, after descriptor selection and machine learning regression, we predict on a held-out test set containing both internal and public data and achieve robust performance for all endpoints compared with baseline models (Tagg R² = 0.57 ± 0.11, Tm,on R² = 0.56 ± 0.01, and Tm R² = 0.60 ± 0.06). In addition, the robustness of the AbMelt molecular dynamics methodology is demonstrated by training on <5% of the data and still outperforming more traditional machine learning models trained on the entire data set of more than 500 internal antibodies. Users can predict thermostability measurements for antibody variable fragments by collecting descriptors and using AbMelt, which has been made available.
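To make the descriptor idea concrete, the following is a minimal sketch of how a radius of gyration deviation could be computed from simulation frames and then correlated with a measured temperature. It is not the AbMelt implementation. It assumes that "deviation" means the standard deviation across frames, it omits mass weighting, and the function names (`radius_of_gyration`, `rg_deviation`, `pearson_r`) and all numbers are hypothetical placeholders.

```python
import math
from statistics import mean, pstdev


def radius_of_gyration(coords):
    """Unweighted radius of gyration of one frame, given (x, y, z) tuples."""
    n = len(coords)
    cx = sum(p[0] for p in coords) / n
    cy = sum(p[1] for p in coords) / n
    cz = sum(p[2] for p in coords) / n
    mean_sq = sum(
        (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 for x, y, z in coords
    ) / n
    return math.sqrt(mean_sq)


def rg_deviation(trajectory):
    """Assumed descriptor: population standard deviation of Rg over frames."""
    return pstdev([radius_of_gyration(frame) for frame in trajectory])


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical toy trajectories: each is a list of frames, and each frame
    # is a list of atom coordinates for the region of interest (e.g. CDRs).
    rigid = [[(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]] * 3
    flexible = [
        [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
        [(1.5, 0.0, 0.0), (-1.5, 0.0, 0.0)],
        [(2.0, 0.0, 0.0), (-2.0, 0.0, 0.0)],
    ]
    very_flexible = [
        [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
        [(2.0, 0.0, 0.0), (-2.0, 0.0, 0.0)],
        [(3.0, 0.0, 0.0), (-3.0, 0.0, 0.0)],
    ]
    descriptors = [rg_deviation(t) for t in (rigid, flexible, very_flexible)]

    # Made-up aggregation temperatures (in degrees C), one per toy antibody.
    tagg = [75.0, 68.0, 61.0]
    print("Rg deviations:", descriptors)
    print("Pearson r vs. Tagg:", pearson_r(descriptors, tagg))
```

In this toy setup, larger fluctuations are paired with lower temperatures, so the correlation is negative, matching the sign of the correlations reported above. A real workflow would read frames from a trajectory file and compute such descriptors per antibody and per simulation temperature before descriptor selection and regression.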