The ultra-dense network architecture of 5G, with dense femtocell deployment, is regarded as a promising solution to the growing demand for cellular services. However, optimal resource allocation among densely and randomly deployed femtocell base stations (FBSs) is a challenging task because of severe interference. To mitigate this interference and achieve efficient resource allocation, we propose a two-stage cluster-based resource allocation scheme. In the first stage, an efficient dynamic clustering algorithm based on unsupervised learning groups the FBSs into an optimal number of clusters while balancing the traffic load across clusters. In the second stage, we address the green resource allocation problem through a cooperative methodology: a cloud-based multi-agent reinforcement learning algorithm operating in a decentralized manner. The cloud server provides fast, flexible access and large storage space, and lowers operational expenditure. Furthermore, the reinforcement learning model exploits a compact representation of the Q-value to cope with the large state-action space, reduce computational complexity, and accelerate convergence. The proposed scheme is evaluated in a real-time experimental setup. Both the experimental and numerical results verify that the proposed scheme significantly improves energy efficiency while guaranteeing QoS.
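To make the two-stage idea concrete, the following is a minimal, purely illustrative sketch. It is not the paper's algorithm: plain k-means stands in for the unsupervised clustering stage (the abstract does not name the specific method), a stateless tabular agent per cluster stands in for the compact Q-value representation, and the power levels and energy-efficiency reward model are invented for the example.

```python
import math
import random

def kmeans(points, k, iters=20, seed=0):
    """Toy stand-in for stage 1: group FBS coordinates into k clusters."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each FBS to its nearest cluster center.
            i = min(range(k),
                    key=lambda c: (p[0] - centers[c][0]) ** 2
                                + (p[1] - centers[c][1]) ** 2)
            clusters[i].append(p)
        # Recompute centers as cluster means (keep old center if empty).
        centers = [(sum(x for x, _ in cl) / len(cl),
                    sum(y for _, y in cl) / len(cl)) if cl else centers[i]
                   for i, cl in enumerate(clusters)]
    return clusters

class QAgent:
    """Stage-2 stand-in: one decentralized agent per cluster.

    The state is collapsed to a single index, so the Q-table is one row --
    a crude analogue of a compact Q-value representation.
    """
    def __init__(self, n_actions, alpha=0.1, gamma=0.9, eps=0.2, seed=0):
        self.q = [0.0] * n_actions
        self.alpha, self.gamma, self.eps = alpha, gamma, eps
        self.rng = random.Random(seed)

    def act(self):
        # Epsilon-greedy action selection.
        if self.rng.random() < self.eps:
            return self.rng.randrange(len(self.q))
        return max(range(len(self.q)), key=self.q.__getitem__)

    def update(self, a, reward):
        # Standard Q-learning update on the single-row table.
        self.q[a] += self.alpha * (reward + self.gamma * max(self.q) - self.q[a])

# Hypothetical transmit power levels in watts (illustrative only).
POWER_LEVELS = [0.1, 0.5, 1.0, 2.0]

def energy_efficiency(power):
    # Invented reward: throughput ~ log of power, cost linear in power.
    return math.log1p(10 * power) / (0.5 + power)

def train(agent, episodes=2000):
    """Train one agent; return the power level it ends up preferring."""
    for _ in range(episodes):
        a = agent.act()
        agent.update(a, energy_efficiency(POWER_LEVELS[a]))
    return POWER_LEVELS[max(range(len(agent.q)), key=agent.q.__getitem__)]
```

Because the reward here is deterministic per action, each agent's Q-row converges to an ordering that matches the energy-efficiency curve, so the agent settles on the intermediate power level that balances throughput against power cost; in the paper's setting, the reward would instead reflect interference and QoS measured across the cluster.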