In the electric power generation sector, balancing maximum power production against acceptable emission limits is a challenging task that requires sophisticated techniques. Traditional methods struggle with this problem because of the large number of process variables involved. In this paper, a deep reinforcement learning optimization framework (DRLOF) is proposed to determine operating conditions for a commercial circulating fluidized bed (CFB) power plant that strike a good balance between performance and environmental requirements. The DRLOF comprises an environment representing the CFB, built from over 1.5 years of plant data with a 1 min sampling time, which interacts with an advantage actor-critic (A2C) agent implemented in two architectures, ‘separate-A2CN’ and ‘shared-A2CN’. The framework was optimized by maximizing electric power generation within the constraints of the plant’s capacity and environmental emission standards, taking the cost of operations into consideration. After training, the framework with the separate-A2CN architecture achieved a 1.97% increase in electricity generation and a 1.59% reduction in NOx emissions at a 14.3 times lower computational cost. Furthermore, different test scenarios demonstrated the flexibility, adaptability and lower computational burden of the DRLOF. The findings of this study are not limited to the CFB power plant but can be extended to other chemical processes and industries. This approach reduces the need for costly experiments and mitigates the challenges of online optimization and the associated customization.

• Multi-objective optimization of a CFB power plant with deep reinforcement learning.
• The objective formulation considered power, fuel, reagent and environmental standards.
• Two main reinforcement learning architectures were analysed for better CFB results.
• CFB performance improved by a 1.97% power increase and a 1.59% emission reduction.
• The framework’s generality, adaptability and computational efficiency were tested.
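
The objective described above (maximize power within capacity and emission limits while accounting for operating cost) can be illustrated with a minimal reward sketch. This is not the reward used in the study: the names `PlantLimits`, `reward` and `penalty_weight`, the linear penalty form, and all numeric limits are assumptions introduced here purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlantLimits:
    # Illustrative placeholder values; NOT taken from the study.
    max_power: float = 100.0   # plant capacity, arbitrary units
    nox_limit: float = 50.0    # NOx emission standard, arbitrary units


def reward(power, nox, fuel_cost, reagent_cost,
           limits=PlantLimits(), penalty_weight=10.0):
    """Hypothetical scalar reward: net benefit minus constraint penalties."""
    # Benefit: generated power net of fuel and reagent costs
    # (all assumed to be expressed in the same arbitrary units).
    net_benefit = power - fuel_cost - reagent_cost
    # Penalty: only the amount by which a limit is exceeded is counted.
    violation = (max(0.0, power - limits.max_power)
                 + max(0.0, nox - limits.nox_limit))
    return net_benefit - penalty_weight * violation
```

Under these assumptions, an operating point that respects both limits is rewarded by its net benefit alone, whereas exceeding the NOx limit reduces the reward in proportion to the excess, which pushes an agent towards feasible operating conditions.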