In the era of 5G and beyond, Mobile Edge Computing (MEC) has emerged as a technology that integrates wireless networks with the Internet, enabling low-latency, high-reliability computing services for mobile users. A crucial prerequisite for deploying MEC is the strategic selection of edge server locations that satisfy computing demands and improve resource utilization. In this paper, we study efficient and intelligent dynamic edge server placement under time-varying network states and placement costs. We formulate edge server placement as a long-term Markov decision process in which the server layout is adjusted dynamically. To achieve intelligent decision-making, we propose two deep reinforcement learning-based algorithms: DBPA, based on D3QN, and PBPA, based on PPO, both of which significantly improve the efficiency and performance of model training. We also propose a novel method that transforms network states into neural network inputs using heat maps and grayscale maps to enhance the agent’s learning efficiency. Experimental results demonstrate that the proposed algorithms achieve intelligent and dynamic placement of edge servers, with DBPA and PBPA outperforming the comparison algorithms by 13.20% to 61.84% and 23.09% to 66.32%, respectively.
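
To make the idea of encoding a network state as a grayscale map concrete, the following is a minimal sketch and not the encoding used in this paper. It assumes a hypothetical state format in which each base station is a tuple `(x, y, workload)` inside a rectangular service area. The function name `workload_to_grayscale`, the grid resolution, and the peak normalization are all illustrative assumptions.

```python
from typing import List, Tuple


def workload_to_grayscale(
    stations: List[Tuple[float, float, float]],  # hypothetical (x, y, workload) tuples
    width: float,
    height: float,
    rows: int,
    cols: int,
) -> List[List[int]]:
    """Rasterize per-station workload into a rows x cols grid of 0-255 intensities.

    Assumes every station lies inside the area [0, width] x [0, height].
    """
    grid = [[0.0] * cols for _ in range(rows)]
    for x, y, load in stations:
        # Map coordinates to a cell; clamp so points on the far edge use the last cell.
        c = min(int(x / width * cols), cols - 1)
        r = min(int(y / height * rows), rows - 1)
        grid[r][c] += load

    peak = max(max(row) for row in grid)
    if peak == 0:
        # No workload anywhere: return an all-black image.
        return [[0] * cols for _ in range(rows)]

    # Scale so that the busiest cell maps to 255 (assumed normalization).
    return [[round(255 * v / peak) for v in row] for row in grid]


if __name__ == "__main__":
    demo = [(0.5, 0.5, 2.0), (0.6, 0.4, 2.0), (1.5, 0.5, 1.0)]
    for row in workload_to_grayscale(demo, width=2.0, height=2.0, rows=2, cols=2):
        print(row)
```

A grid of this kind could be fed to a convolutional network as a single-channel image. Normalizing by the peak keeps the input range fixed as the total load varies over time, at the cost of discarding the absolute workload scale.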