Power scheduling is an NP-hard optimization problem that demands a delicate equilibrium between economic costs and environmental emissions. In response to growing concern about climate change, global environmental policies prioritize decarbonizing the electricity sector by integrating renewable energies (REs) into power grids. While this integration brings economic and environmental benefits, the intermittency of REs amplifies the uncertainty and complexity of power scheduling. Existing optimization approaches are often limited to a small number of units, overlook critical parameters, and disregard the intermittency of REs. To address these limitations, this article introduces a robust and scalable optimization algorithm for renewable-integrated power scheduling based on reinforcement learning (RL). In the proposed methodology, the power scheduling problem is decomposed into Markov decision processes (MDPs) within a multi-agent simulation environment. The simulated MDPs are used to train a deep reinforcement learning (DRL) model for solving the optimization. The proposed method is validated across various test systems, encompassing single- to tri-objective problems with 10–100 generating units. The findings consistently demonstrate the superior performance of the proposed DRL algorithm compared to existing methods, such as multi-agent immune system-based evolutionary priority list (MAI-EPL), binary real-coded genetic algorithm (BRCGA), teaching learning-based optimization (TLBO), quasi-oppositional teaching learning-based algorithm (QOTLBO), hybrid genetic-imperialist competitive algorithm (HGICA), three-stage priority list (TSPL), real-coded grey wolf optimization (RCGWO), multi-objective evolutionary algorithm based on decomposition (MOEAD), and non-dominated sorting genetic algorithms (NSGA-II and NSGA-III). The experimental results also highlight the importance of integrating REs into larger power systems.
In a 10-unit system with 2.81 % RE penetration, reductions of 3.42 %, 4.03 %, and 3.10 % were observed in costs, CO2 emissions, and SO2 emissions, respectively. Similarly, in a 100-unit system with an RE penetration rate of only 0.28 %, reductions of 3.75 % in cost, 4.42 % in CO2, and 3.34 % in SO2 were observed. These findings emphasize the effectiveness of RE integration in larger-scale power systems, even at lower penetration rates.
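
To make the MDP framing concrete, the following is a minimal sketch of how an hourly scheduling problem might be cast as an MDP. It is not the article's environment or reward design: the class and field names (`Unit`, `ToySchedulingEnv`, `emission_weight`, `shortfall_penalty`), the cheapest-first dispatch rule, the weighted-sum reward, and all numbers are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Unit:
    """Hypothetical thermal unit; all figures are illustrative only."""
    capacity_mw: float
    cost_per_mwh: float
    co2_per_mwh: float


class ToySchedulingEnv:
    """Toy hourly commitment MDP: one on/off action per unit per hour."""

    def __init__(self, units: List[Unit], demand_mw: Sequence[float],
                 renewable_mw: Sequence[float], emission_weight: float = 0.5,
                 shortfall_penalty: float = 1000.0) -> None:
        self.units = units
        self.demand_mw = list(demand_mw)
        self.renewable_mw = list(renewable_mw)
        self.emission_weight = emission_weight      # assumed scalarization weight
        self.shortfall_penalty = shortfall_penalty  # assumed penalty per unmet MW
        self.hour = 0

    def _state(self) -> Tuple[int, float]:
        # State = (hour, net demand left after renewable generation).
        net = max(self.demand_mw[self.hour] - self.renewable_mw[self.hour], 0.0)
        return (self.hour, net)

    def reset(self) -> Tuple[int, float]:
        self.hour = 0
        return self._state()

    def step(self, commitment: Sequence[int]
             ) -> Tuple[Optional[Tuple[int, float]], float, bool]:
        """Apply on/off decisions (1 = on) and return (next_state, reward, done)."""
        _, remaining = self._state()
        cost = 0.0
        co2 = 0.0
        # Dispatch committed units cheapest-first until net demand is met.
        order = sorted(range(len(self.units)),
                       key=lambda i: self.units[i].cost_per_mwh)
        for i in order:
            if commitment[i] and remaining > 0:
                unit = self.units[i]
                output = min(unit.capacity_mw, remaining)
                cost += output * unit.cost_per_mwh
                co2 += output * unit.co2_per_mwh
                remaining -= output
        w = self.emission_weight
        # Reward is the negative weighted cost/emission sum plus a shortfall penalty.
        reward = -((1 - w) * cost + w * co2 + self.shortfall_penalty * remaining)
        self.hour += 1
        done = self.hour >= len(self.demand_mw)
        return (None if done else self._state()), reward, done


if __name__ == "__main__":
    env = ToySchedulingEnv(
        units=[Unit(100.0, 20.0, 0.9), Unit(50.0, 35.0, 0.5)],
        demand_mw=[120.0, 80.0],
        renewable_mw=[10.0, 0.0],
    )
    state = env.reset()
    state, reward, done = env.step([1, 1])
    print(state, reward, done)
```

In this sketch, renewable output simply lowers the net demand that the thermal units must cover, which is one simple way to see why even a small RE share can reduce both cost and emissions. A learning agent would replace the hand-written commitment vector passed to `step`.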