Abstract

Medical robots face challenges when interacting with people or operating in complex, dynamic environments, because human morphology varies and environmental changes are unpredictable. Compliant human-robot interaction is the primary goal when a medical robot is in contact with the human body, so the robot must adaptively adjust its forces and motions to ensure safety and comfort during contact. This paper focuses on compliance control of rehabilitation massage robots in dynamic scenes. We propose a compliance control method for a robotic arm based on the Soft Actor-Critic (SAC) algorithm. We construct a simulated massage environment in a dynamic scene according to the task requirements and design a massage path covering the entire back. Within a deep reinforcement learning framework, we design a reward function tailored to constant force control in dynamic scenes. Extensive simulation experiments verify that the robotic arm can follow the predetermined massage path while maintaining a constant contact force with the simulated body module. The actual contact force tracks the target contact force to within 0.1 N.
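As a rough illustration of what a reward for constant force control might look like, the following is a minimal sketch only. The abstract does not specify the reward used in this work, so the function name `force_tracking_reward`, the weights, the bonus term, and the use of a path-deviation penalty are all assumptions introduced for illustration. Only the 0.1 N tolerance is taken from the reported result.

```python
def force_tracking_reward(measured_force, target_force, path_error,
                          force_weight=1.0, path_weight=0.5, tolerance=0.1):
    """Hypothetical per-step reward for constant-force contact (not the paper's).

    measured_force, target_force: contact forces in newtons.
    path_error: distance of the end effector from the planned massage path.
    tolerance: force error in newtons below which a bonus is granted
               (0.1 N mirrors the accuracy reported in the abstract).
    """
    force_error = abs(measured_force - target_force)

    # Penalize deviation from the target force and from the planned path.
    reward = -force_weight * force_error - path_weight * abs(path_error)

    # Assumed shaping term: a fixed bonus when the force is within tolerance.
    if force_error <= tolerance:
        reward += 1.0

    return reward
```

In a setup like this, an SAC agent would receive the reward at every simulation step, so that policies holding the contact force near the target while staying on the path accumulate higher return.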