Abstract

Monoclonal antibodies (mAbs) are biopharmaceutical products that improve human immunity. In this work, we propose a multi-actor proximal policy optimization (PPO)-based reinforcement learning (RL) approach for the control of mAb production. The manipulated variable is the flowrate and the controlled variable is the mAb concentration. Based on root mean square error (RMSE) values and convergence performance, multi-actor PPO performs better than the other RL algorithms considered. The results indicate that PPO reduces the number of days needed to reach the desired concentration by 40%. Moreover, the performance of PPO improves as the number of actors increases: the agent performs best with three actors, and its performance deteriorates when more actors are added. These results are verified in three case studies: (i) nominal conditions, (ii) noise in raw materials and measurements, and (iii) stochastic disturbance in temperature together with noise in measurements. The results indicate that the proposed approach outperforms the deep deterministic policy gradient (DDPG), twin delayed deep deterministic policy gradient (TD3), and PPO algorithms for the control of the bioreactor system.
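
The abstract does not state how the RMSE comparison is computed. As a minimal sketch, assuming the standard definition of RMSE applied to the deviation of the measured mAb concentration from a constant setpoint, the metric could be evaluated as follows. The function name `rmse`, the sample trajectory, and the target value are hypothetical illustrations, not data or code from this work.

```python
import math


def rmse(measured, setpoint):
    """Root mean square error between a measured trajectory and a constant setpoint."""
    if not measured:
        raise ValueError("measured trajectory is empty")
    squared_errors = [(y - setpoint) ** 2 for y in measured]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical daily mAb concentrations in arbitrary units (illustrative only).
trajectory = [0.0, 0.5, 0.8, 1.0, 1.0]
target = 1.0
print(rmse(trajectory, target))
```

Under this assumption, a lower RMSE means the controller keeps the concentration closer to the target over the run, which is one way the algorithms could be ranked.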
