Various types of Internet of Things (IoT) services have recently become widespread, and new types of IoT devices continue to emerge. However, the large number of high-rise buildings in urban environments makes it difficult for existing terrestrial IoT networks to provide these devices with seamless, high-speed connectivity. We therefore consider unmanned aerial vehicle (UAV)-aided IoT networks, in which a ground base station (GBS) and a UAV base station (UBS) coexist. In this paper, we propose a hierarchical multi-agent Q-learning (HiMAQ) framework that aims to maximize the system throughput while minimizing device outage in UAV-aided IoT networks. HiMAQ adopts distributed multi-agent inner-loop reinforcement learning (RL) to determine the optimal transmission power of the UBSs and GBSs, and distributed outer-loop RL to determine the optimal UBS deployment. This hierarchical architecture reduces the computational complexity compared with a centralized RL approach. We evaluate HiMAQ under various network conditions, considering the mobility of IoT devices and both even and uneven spatial traffic distributions. Simulation results show that HiMAQ achieves higher system throughput and fewer outage devices than various benchmark algorithms.
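To make the inner-loop/outer-loop split concrete, the following is a minimal sketch of a two-level tabular Q-learning loop, not the HiMAQ implementation itself. It assumes a single agent per level (the framework described above is multi-agent), a one-dimensional grid of candidate UBS positions, a small set of discrete power levels, and a hypothetical `toy_reward` function standing in for the throughput and outage objective. All function names, parameters, and values are illustrative assumptions.

```python
import random


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update on a dict-of-lists Q-table."""
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])


def greedy(q_row):
    """Index of the largest Q-value (first index wins on ties)."""
    return max(range(len(q_row)), key=q_row.__getitem__)


def epsilon_greedy(q_row, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return greedy(q_row)


def toy_reward(position, power):
    """Hypothetical stand-in for 'throughput minus outage penalty'.

    Peaks (value 0) at position 2 and power level 1. Rewards are never
    positive, so zero-initialized Q-values act as optimistic initial values.
    """
    return -abs(position - 2) - abs(power - 1)


def train(num_positions=5, num_powers=3, outer_steps=200, inner_steps=20,
          epsilon=0.2, seed=0):
    rng = random.Random(seed)
    # Outer Q-table: one dummy state, one action per candidate UBS position.
    q_outer = {0: [0.0] * num_positions}
    # Inner Q-table: state = UBS position, action = transmit power level.
    q_inner = {p: [0.0] * num_powers for p in range(num_positions)}

    for _ in range(outer_steps):
        # Outer loop: choose where to deploy the UBS.
        pos = epsilon_greedy(q_outer[0], epsilon, rng)
        # Inner loop: learn the power level for this deployment.
        for _ in range(inner_steps):
            power = epsilon_greedy(q_inner[pos], epsilon, rng)
            q_update(q_inner, pos, power, toy_reward(pos, power), pos)
        # Score the deployment by the reward of the inner greedy power choice.
        best_power = greedy(q_inner[pos])
        q_update(q_outer, 0, pos, toy_reward(pos, best_power), 0)

    best_pos = greedy(q_outer[0])
    return best_pos, greedy(q_inner[best_pos])


if __name__ == "__main__":
    print(train())
```

In this sketch the outer table has only `num_positions` entries and the inner table `num_positions * num_powers`, and each update touches a single entry. That is the intuition behind splitting deployment and power control into two loops instead of learning over the joint action space at once, although the actual complexity reduction depends on the state and action definitions used in the paper.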