In environmental monitoring systems based on the Internet of Things (IoT), sensor nodes (SNs) typically send data to the server via a wireless gateway (GW) at regular intervals. However, when SNs are located far from the GW, they expend substantial energy on transmission. This paper introduces a novel unmanned aerial vehicle (UAV)-based environmental monitoring system in which the UAV patrols the designated area and each SN periodically transmits its collected data to either the GW or the UAV, choosing between them based on its distance to each. To maintain a high-quality environmental map, characterized by the consistent collection of a sufficient amount of up-to-date data without depleting the energy of the SNs or the UAV, the UAV periodically makes three types of decisions: where to move, whether to relay or aggregate the data from the SNs, and whether to transfer energy to the SNs. To optimize these decisions, we introduce DeepUAV, a deep reinforcement learning (DRL)-based algorithm in which the controller continually learns online, improving the UAV's decisions through trial and error. Evaluation results indicate that DeepUAV consistently gathers a substantial amount of up-to-date data while mitigating the risk of energy depletion in the SNs and the UAV.
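As a minimal illustration of the distance-based transmission decision, the sketch below assumes a nearest-receiver rule (the exact criterion used in the system is not specified here); the function name and positions are purely illustrative.

```python
import math

def choose_receiver(sn_pos, gw_pos, uav_pos):
    """Illustrative sketch (assumed rule, not the paper's exact criterion):
    the SN transmits to whichever of the GW or the UAV is nearer, since
    transmission energy grows with distance to the receiver."""
    d_gw = math.dist(sn_pos, gw_pos)    # distance from SN to the gateway
    d_uav = math.dist(sn_pos, uav_pos)  # distance from SN to the patrolling UAV
    return "UAV" if d_uav < d_gw else "GW"
```

For example, an SN far from the GW but currently close to the patrolling UAV would offload its data to the UAV, saving transmission energy.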