Solar Tower (ST) systems use heliostats to concentrate solar radiation onto a tower-mounted receiver. Optimizing how these heliostats are aimed at the receiver remains a critical challenge because solar radiation varies continuously and energy capture must be maximized without compromising operational safety. This paper introduces a novel, model-free deep Reinforcement Learning (RL) approach to optimizing heliostat aiming strategies, based on the Soft Actor–Critic (SAC) algorithm. SAC is an advanced RL method that enhances the traditional Actor–Critic framework with two neural networks. The proposed approach adjusts the aiming points across the receiver surface dynamically, in real time, to improve the overall performance of the ST plant. The strategy was simulated and evaluated over a full operational year and compared with traditional methods. The results show an increase of more than 8.8% in yearly absorbed power, a significant improvement that directly enhances plant performance and contributes to better economic outcomes for the technology. The technique also eliminates the need for constant human intervention and is applicable to both existing and future plants.