This study investigates light olefin separation using dual dividing wall columns (DWCs) and compares several tuning methods for proportional-integral (PI) controllers. A novel process-conscious reinforcement learning (RL) approach based on Deep Deterministic Policy Gradient (DDPG) is developed; it incorporates domain-specific knowledge into the reward function to enforce compliance with constraints on critical process variables and product purity. The DDPG-based PI tuning is evaluated against traditional methods: Aspen’s recommended initial tuning, Ziegler-Nichols (ZN), and Cohen-Coon (CC). The results show that the DDPG method significantly improves control stability and accuracy, reducing error by factors of 11.9, 2.3, and 1.6 compared with Aspen, ZN, and CC, respectively. DDPG is also more energy efficient, consuming 13% less energy than the next-best method, and it reduces purification costs by up to 13.5% and CO2 emissions by 20% compared with the other methods. These findings highlight the potential of integrating advanced RL techniques into industrial process control, with substantial improvements in stability, energy efficiency, and economic and environmental performance.
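For readers unfamiliar with the two classical baselines named above, the following is a minimal sketch of the textbook PI tuning rules. The abstract does not state which variants were used, so this assumes the closed-loop (ultimate gain) form of Ziegler-Nichols and the open-loop Cohen-Coon form for a first-order-plus-dead-time model. The names `PISettings`, `ziegler_nichols_pi`, and `cohen_coon_pi` are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PISettings:
    """PI controller settings (hypothetical container for illustration)."""
    gain: float           # controller gain Kc
    integral_time: float  # integral time Ti, in the same time units as the inputs


def ziegler_nichols_pi(ultimate_gain: float, ultimate_period: float) -> PISettings:
    """Textbook closed-loop Ziegler-Nichols PI rule: Kc = 0.45*Ku, Ti = Pu/1.2."""
    if ultimate_gain <= 0 or ultimate_period <= 0:
        raise ValueError("ultimate gain and period must be positive")
    return PISettings(gain=0.45 * ultimate_gain,
                      integral_time=ultimate_period / 1.2)


def cohen_coon_pi(process_gain: float, time_constant: float,
                  dead_time: float) -> PISettings:
    """Textbook Cohen-Coon PI rule for a first-order-plus-dead-time model.

    Assumes the process is approximated by K * exp(-theta*s) / (tau*s + 1).
    """
    if process_gain == 0 or time_constant <= 0 or dead_time <= 0:
        raise ValueError("need nonzero gain and positive time constant and dead time")
    ratio = dead_time / time_constant  # theta / tau
    gain = (1.0 / process_gain) * (1.0 / ratio) * (0.9 + ratio / 12.0)
    integral_time = dead_time * (30.0 + 3.0 * ratio) / (9.0 + 20.0 * ratio)
    return PISettings(gain=gain, integral_time=integral_time)
```

Either function returns a gain and integral time that could serve as a starting point for a single PI loop; an RL-based tuner such as the one described here would instead adjust these two parameters in response to a reward signal.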