Abstract

This paper presents a novel, fully decentralized, intelligent energy management system (EMS) for a smart microgrid based on a reinforcement learning (RL) strategy. The purpose of the proposed EMS is to maximize the benefit of all microgrid entities, comprising customers and distributed energy resources (DERs). Because renewable energy sources are unpredictable and consumers’ demands vary, designing the microgrid EMS is a complicated task. To overcome this issue, the multi‐agent hour‐ahead energy management problem is modelled as a finite Markov decision process, and the microgrid entities are treated as intelligent agents. The optimal policy of the agents is obtained through a newly developed framework of the model‐free Q‐learning algorithm that maximizes the benefit of all renewable and non‐renewable energy resources and of the battery energy storage system. A degradation model of the battery is included to reduce the number of battery replacements. To ensure customers’ comfort, customers’ expenses are decreased without demand curtailment by introducing two types of load‐shifting techniques. The microgrid operation is analysed under four scenarios: no‐learning, generator‐learning, customer‐learning, and whole‐learning. The performance of the proposed algorithm is compared with that of the Monte Carlo method, and simulation results on a real power‐grid dataset show the superiority of the proposed algorithm.
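
To make the model‐free Q‐learning idea concrete, the following is a minimal sketch of standard tabular Q‐learning for a single battery agent on a toy finite Markov decision process. It is not the framework developed in this paper: the two price levels, the one‐unit battery, the reward values, and all names (`step`, `q_update`, `train`, `greedy_policy`) are hypothetical assumptions chosen only for illustration, and battery degradation, load shifting, and multiple agents are left out.

```python
import random

# Hypothetical toy setup (not from the paper): two hourly price levels and a
# battery that holds either 0 or 1 unit of energy.
ACTIONS = ("charge", "idle", "discharge")
PRICE = {"low": 1.0, "high": 3.0}


def step(state, action):
    """Toy deterministic environment: the price level alternates every hour."""
    price_level, stored = state
    reward = 0.0
    if action == "charge" and stored == 0:
        stored, reward = 1, -PRICE[price_level]  # buy one unit
    elif action == "discharge" and stored == 1:
        stored, reward = 0, PRICE[price_level]   # sell one unit
    # Infeasible actions (e.g. charging a full battery) behave like "idle".
    next_level = "high" if price_level == "low" else "low"
    return (next_level, stored), reward


def q_update(q, state, action, reward, next_state, alpha, gamma):
    """Standard rule: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q[next_state].values())
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])


def train(hours=20000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a Q-table with epsilon-greedy exploration; no model of `step` is used."""
    rng = random.Random(seed)
    q = {(lvl, s): {a: 0.0 for a in ACTIONS} for lvl in PRICE for s in (0, 1)}
    state = ("low", 0)
    for _ in range(hours):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)                  # explore
        else:
            action = max(q[state], key=q[state].get)      # exploit
        next_state, reward = step(state, action)
        q_update(q, state, action, reward, next_state, alpha, gamma)
        state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every state."""
    return {s: max(acts, key=acts.get) for s, acts in q.items()}


if __name__ == "__main__":
    for state, action in sorted(greedy_policy(train()).items()):
        print(state, "->", action)
```

In this toy setting the agent only observes states and rewards, which is what "model‐free" refers to; a decentralized multi‐agent scheme would, under the same assumptions, give each entity its own table of this kind.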