Combining deep neural networks with reinforcement learning, known as Deep Reinforcement Learning (DRL), is transforming fields such as medicine, industry, and gaming. DRL has achieved groundbreaking results, particularly in complex Real-Time Strategy (RTS) games such as StarCraft II and Dota 2, which serve as benchmarks for testing the robustness and safety of RL algorithms.

Despite these successes, DRL algorithms face challenges, including high computational costs and a lack of safety-aware approaches. Training these algorithms demands extensive computational resources, creating a significant divide between algorithms developed on supercomputers and those feasible on standard hardware; it also raises sustainability concerns due to increased CO2 emissions. Additionally, most RL algorithms are risk-neutral, which limits their deployment in safety-critical systems.

We present a novel model-based DRL approach, the Safe Observations Rewards Actions Costs Learning Ensemble (S-ORACLE), to address these challenges. S-ORACLE balances robust safety awareness with minimized risk and computational efficiency. Empirical validation across complex game environments (Deep RTS, ELF: MiniRTS, MicroRTS, Deep Warehouse, and StarCraft II) demonstrates that S-ORACLE outperforms state-of-the-art methods, significantly improving safety performance, reducing computational costs, and lowering environmental impact while maintaining high training efficiency and adaptability.