Enabled by advanced information and communication technologies in smart grid systems, demand response (DR) has become an effective method for improving grid reliability and reducing energy costs, because it can react quickly to supply-demand mismatches by adjusting flexible loads on the demand side. This paper proposes a dynamic pricing DR algorithm for energy management in a hierarchical electricity market that considers both the service provider’s (SP) profit and the customers’ (CUs) costs. Reinforcement learning (RL) is used to illustrate the hierarchical decision-making framework: the dynamic pricing problem is formulated as a discrete finite Markov decision process (MDP), and Q-learning is adopted to solve it. Using RL, the SP can adaptively decide the retail electricity price during the online learning process, in which the uncertainty of CUs’ load demand profiles and the flexibility of wholesale electricity prices are addressed. Simulation results show that the proposed DR algorithm can promote SP profitability, reduce energy costs for CUs, balance energy supply and demand in the electricity market, and improve the reliability of electric power systems, making it a win-win strategy for both the SP and CUs.
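To make the formulation concrete, the following is a minimal sketch of tabular Q-learning applied to a toy dynamic pricing problem. It is not the algorithm evaluated in this paper: the wholesale prices, base demands, markup set, elasticity, discomfort coefficient, and weighting factor `RHO` are all hypothetical values chosen for illustration, and the reward (a weighted difference between SP profit and CU cost) is an assumed form. The state is simply the time slot, so the chosen price does not influence the next state in this simplified setting.

```python
import random

# All values below are hypothetical and chosen only for illustration.
WHOLESALE = [2.0, 3.0, 5.0, 3.0]        # wholesale price in each time slot
BASE_DEMAND = [4.0, 6.0, 8.0, 5.0]      # CU demand when retail == wholesale
MARKUPS = [1.0, 1.25, 1.5, 1.75, 2.0]   # actions: retail = markup * wholesale
ELASTICITY = -0.5                       # assumed demand response to markup
DISCOMFORT = 0.5                        # assumed CU cost of curtailed demand
RHO = 0.5                               # assumed weight: SP profit vs. CU cost


def reward_for(slot, action, noise=1.0):
    """Weighted SP profit minus CU cost for one (slot, action) pair."""
    wholesale = WHOLESALE[slot]
    retail = MARKUPS[action] * wholesale
    base = BASE_DEMAND[slot] * noise  # noise models uncertain load profiles
    demand = max(base * (1.0 + ELASTICITY * (MARKUPS[action] - 1.0)), 0.0)
    sp_profit = (retail - wholesale) * demand
    cu_cost = retail * demand + DISCOMFORT * (base - demand) ** 2
    return RHO * sp_profit - (1.0 - RHO) * cu_cost


def train(episodes=5000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a Q-table indexed by [time slot][markup index]."""
    rng = random.Random(seed)
    n_slots, n_actions = len(WHOLESALE), len(MARKUPS)
    q = [[0.0] * n_actions for _ in range(n_slots)]
    for _ in range(episodes):
        for slot in range(n_slots):
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)
            else:
                action = max(range(n_actions), key=lambda a: q[slot][a])
            r = reward_for(slot, action, noise=rng.uniform(0.9, 1.1))
            # The last slot is terminal, so it has no bootstrapped term.
            last = slot == n_slots - 1
            target = r if last else r + gamma * max(q[slot + 1])
            # Standard Q-learning update.
            q[slot][action] += alpha * (target - q[slot][action])
    return q


def greedy_prices(q):
    """Retail price selected by the greedy policy in each time slot."""
    prices = []
    for slot in range(len(WHOLESALE)):
        best = max(range(len(MARKUPS)), key=lambda a: q[slot][a])
        prices.append(MARKUPS[best] * WHOLESALE[slot])
    return prices


if __name__ == "__main__":
    q_table = train()
    print(greedy_prices(q_table))
```

In this sketch the SP never needs an explicit model of CU behavior: it observes only the realized reward after posting a price, which is the sense in which the retail price is learned online.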