Optimizing resource allocation in cloud computing is increasingly critical as the number of cloud consumers grows and the computing demands of modern technology rise. Cloud infrastructures typically consist of heterogeneous servers hosting multiple virtual machines with potentially different specifications and volatile resource usage. As a result, resource allocation must contend with several issues, including energy conservation, fault tolerance, and workload balancing. Finding a comprehensive solution that addresses all of these issues is an essential concern for cloud service providers. This paper presents a new resource allocation model based on an intelligent multi-agent system and a reinforcement learning method (IMARM). It combines the characteristics of multi-agent systems with the Q-learning process to improve the performance of cloud resource allocation. IMARM uses the properties of multi-agent systems to allocate and release resources dynamically, and thus responds well to changing consumer demands. Meanwhile, the reinforcement learning policy moves virtual machines to the best state given the current state of the environment. We also study the impact of IMARM on execution time. The experimental results show that the proposed solution outperforms comparable algorithms in energy consumption and fault tolerance, while maintaining reasonable load balancing and acceptable execution time.