As the road transport industry grows rapidly, the safety and efficiency of heavy-vehicle platoons have attracted significant attention. Obstacle avoidance is especially difficult for heavy vehicles because of their considerable mass, delayed response times, and obstructed lines of sight. This study addresses the problem by introducing a cooperative obstacle avoidance system for heavy-vehicle platoons based on deep reinforcement learning. The system comprises three primary modules: perception, decision-making, and control. First, the perception module acquires real-time environmental data. Next, the decision-making module selects an obstacle avoidance strategy based on the acquired data: under low collision risk it applies a two-stage braking strategy, whereas under high collision risk, when a steering maneuver is feasible, it plans an avoidance path with a fifth-degree polynomial and tracks that path. Finally, the control module uses the local multi-agent deep deterministic policy gradient (LADDPG) algorithm to train the heavy-vehicle platoon agents so that the platoon maintains its formation while avoiding collisions with other vehicles and obstacles. Simulation experiments show that the proposed system adapts to various traffic conditions, selects a suitable obstacle avoidance strategy for each, and significantly improves both obstacle avoidance performance and platoon stability.
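The fifth-degree polynomial used for steering avoidance can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a common textbook formulation in which the lateral offset moves from a start value to an end value with zero lateral velocity and acceleration at both ends, and in which longitudinal speed is constant. The function names, the 3.5 m lane offset, and the 4 s maneuver duration are hypothetical values chosen for illustration.

```python
def quintic_lateral_offset(t, duration, y_start, y_end):
    """Lateral offset at time t along a quintic avoidance path.

    Assumed boundary conditions (not taken from the paper): zero lateral
    velocity and zero lateral acceleration at t = 0 and t = duration.
    """
    # Normalised time, clamped to [0, 1] so the offset holds before/after the maneuver.
    s = min(max(t / duration, 0.0), 1.0)
    # Closed-form quintic blend satisfying the six boundary conditions above.
    blend = 10 * s**3 - 15 * s**4 + 6 * s**5
    return y_start + (y_end - y_start) * blend


def sample_path(speed, duration, y_start, y_end, n_points=5):
    """Return (x, y) waypoints, assuming constant longitudinal speed."""
    path = []
    for i in range(n_points):
        t = duration * i / (n_points - 1)
        path.append((speed * t, quintic_lateral_offset(t, duration, y_start, y_end)))
    return path


if __name__ == "__main__":
    # Hypothetical maneuver: 3.5 m lateral shift over 4 s at 20 m/s.
    for x, y in sample_path(speed=20.0, duration=4.0, y_start=0.0, y_end=3.5):
        print(f"x = {x:6.1f} m, y = {y:5.3f} m")
```

A quintic is the lowest-degree polynomial that can satisfy position, velocity, and acceleration constraints at both endpoints, which keeps lateral acceleration continuous. That matters for heavy vehicles with limited roll stability.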