Abstract: This letter introduces an innovative approach for minimizing energy consumption in multi‐unmanned aerial vehicle (multi‐UAV) networks using deep reinforcement learning, with a focus on optimizing the age of information (AoI) in disaster environments. A hierarchical UAV deployment strategy is proposed that facilitates cooperative trajectory planning, ensuring timely data collection and transmission while minimizing energy consumption. The inter‐UAV network path planning problem is formulated as a Markov decision process, and a deep Q‐network (DQN) strategy is applied to enable real‐time decision making that accounts for dynamic environmental changes, obstacles, and UAV battery constraints. Extensive simulations in both rural and urban scenarios demonstrate the effectiveness of employing a memory access approach within the DQN framework, which reduces energy consumption by up to 33.25% in rural settings and 74.20% in urban environments compared with non‐memory approaches. By integrating AoI considerations with energy‐efficient UAV control, this work offers a robust solution for maintaining fresh data in critical applications, such as disaster response, where ground‐based communication infrastructure is compromised. The replay memory approach, particularly the online history approach, proves crucial in adapting to changing conditions and optimizing UAV operations for both data freshness and energy consumption.