Polymer-electrolyte aqueous two-phase systems (ATPS) have demonstrated superior performance in the separation and purification of high-value biomolecules. However, these powerful platforms remain largely an academic curiosity and have not yet been accepted and implemented by industry. One major obstacle is the absence of models that predict the partition of biomolecules in ATPS easily and reliably. To address this limitation, this work models the binodal curve behavior of polymer-electrolyte ATPS and the partitioning of biomolecules in these aqueous electrolyte solutions. First, a comprehensive database of the studied systems is established. In total, it includes 11,998 experimental binodal data points covering 276 polymer-electrolyte ATPS at different temperatures (273.15 K to 399.15 K) and 626 experimental partition data points involving 22 biomolecules in 42 polymer-electrolyte ATPS at different temperatures (283.15 K to 333.15 K). Then, a novel modeling strategy is proposed that combines a well-known machine learning algorithm, the artificial neural network (ANN), with the group contribution (GC) method. Based on this strategy, one ANN-GC model (ANN-GC model1) is built to describe the binodal curve behavior of polymer-electrolyte ATPS, and another (ANN-GC model2) is developed to predict the partition of biomolecules in these biphasic systems. ANN-GC model1 gives a mean absolute error (MAE) of 0.0132 and a squared correlation coefficient (R2) of 0.9878 for the 9,598 training data points, 0.0141 and 0.9858 for the 1,200 validation data points, and 0.0143 and 0.9846 for the 1,200 test data points. ANN-GC model2 gives a root-mean-square deviation (RMSD) of 0.0577 for the 501 training data points, 0.0849 for the 62 validation data points, and 0.0885 for the 63 test data points.
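For readers unfamiliar with the error metrics reported above, the following is a minimal sketch of how MAE, RMSD, and a squared correlation coefficient are commonly computed. It assumes R2 means the squared Pearson correlation between experimental and predicted values; the abstract does not give the exact formula, and the function names are hypothetical rather than taken from the paper.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error between experimental and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmsd(y_true, y_pred):
    """Root-mean-square deviation between experimental and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def squared_correlation(y_true, y_pred):
    """Squared Pearson correlation coefficient (assumed meaning of R2)."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov ** 2 / (var_t * var_p)
```

Each function takes two equal-length sequences, for example the experimental and predicted values of one data split.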
Furthermore, the results indicate that the tie-line length of polymer-electrolyte ATPS calculated from ANN-GC model1 can be used directly in ANN-GC model2 to predict the partition performance coefficient of biomolecules in these ATPS. The developed models make it possible to predict the partition of biomolecules in ATPS without requiring any experimental data. Using the developed ANN-GC models, several high-performance ATPS are identified for partitioning four well-known biomolecules.
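The tie-line length mentioned above is conventionally defined as the Euclidean distance between the compositions of the two equilibrium phases. The sketch below assumes that standard definition, with compositions given as mass fractions of polymer and electrolyte. The abstract does not state the exact form used in the paper, and the function name and example values are purely illustrative.

```python
import math


def tie_line_length(top_phase, bottom_phase):
    """Tie-line length from two phase compositions.

    Each phase is a (polymer_fraction, electrolyte_fraction) pair.
    Assumes the conventional definition:
    TLL = sqrt((delta polymer)^2 + (delta electrolyte)^2).
    """
    d_polymer = top_phase[0] - bottom_phase[0]
    d_electrolyte = top_phase[1] - bottom_phase[1]
    return math.hypot(d_polymer, d_electrolyte)


# Illustrative compositions only, not data from the paper.
example_tll = tie_line_length((0.3, 0.0), (0.0, 0.4))
```

In the workflow the abstract describes, such a value would be computed from phase compositions predicted by the binodal model and then passed to the partition model as an input.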