Process control of mAb production using multi-actor proximal policy optimization

Nikita Gupta,Shikhar Anand,Tanuja Joshi,Deepak Kumar,Manojkumar Ramteke,Hariprasad Kodamana

doi:10.1016/j.dche.2023.100108

Nikita Gupta, Shikhar Anand + Show 4 more

Open Access

https://doi.org/10.1016/j.dche.2023.100108

Copy DOI

Abstract

Monoclonal antibodies (mAb) are biopharmaceutical products that improve human immunity. In this work, we propose a multi-actor proximal policy optimization-based reinforcement learning (RL) for the control of mAb production. Here, manipulated variable is flowrate and the control variable is mAb concentration. Based on root mean square error (RMSE) values and convergence performance, it has been observed that multi-actor PPO has performed better as compared to other RL algorithms. It is observed that PPO predicts a 40 % reduction in the number of days to reach the desired concentration. Moreover, the performance of PPO is improved as the number of actors increases. PPO agent shows the best performance with three actors, but on further increasing, its performance deteriorated. These results are verified based on three case studies, namely, (i) for nominal conditions, (ii) in the presence of noise in raw materials and measurements, and (iii) in the presence of stochastic disturbance in temperature and noise in measurements. The results indicate that the proposed approach outperforms the deep deterministic policy gradient (DDPG), twin delayed deep deterministic policy gradient (TD3), and proximal policy optimization (PPO) algorithms for the control of the bioreactor system.

Full Text