Abstract

Monoclonal antibodies (mAbs) are biopharmaceutical products that improve human immunity. In this work, we propose a multi-actor proximal policy optimization (PPO)-based reinforcement learning (RL) approach for the control of mAb production. The manipulated variable is the flowrate and the controlled variable is the mAb concentration. Based on root mean square error (RMSE) values and convergence performance, multi-actor PPO performs better than the other RL algorithms considered. PPO is observed to give a 40 % reduction in the number of days needed to reach the desired concentration. Moreover, the performance of PPO improves as the number of actors increases: the PPO agent performs best with three actors, and its performance deteriorates when further actors are added. These results are verified in three case studies, namely, (i) nominal conditions, (ii) noise in the raw materials and measurements, and (iii) stochastic disturbance in temperature together with noise in the measurements. The results indicate that the proposed approach outperforms the deep deterministic policy gradient (DDPG), twin delayed deep deterministic policy gradient (TD3), and PPO algorithms for the control of the bioreactor system.
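
As an illustration of the RMSE criterion mentioned above, the following is a minimal sketch of how a setpoint-tracking RMSE could be computed for a concentration trajectory. It assumes a constant setpoint; the function name `tracking_rmse` and the sample values are hypothetical and are not taken from the paper.

```python
import math


def tracking_rmse(measured, setpoint):
    """Root mean square error between a measured trajectory and a constant setpoint."""
    if not measured:
        raise ValueError("measured trajectory must not be empty")
    squared_errors = [(y - setpoint) ** 2 for y in measured]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical daily mAb concentrations (arbitrary units), not data from the paper.
trajectory = [0.0, 0.5, 0.8, 1.0, 1.0]
rmse = tracking_rmse(trajectory, setpoint=1.0)
print(f"tracking RMSE: {rmse:.4f}")
```

A lower value means the controlled concentration stays closer to the target over the run, so the metric can rank controllers that are evaluated on the same scenario.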
