As distributed generation (DG) is connected to distribution networks (DNs) on a large scale, traditional day-ahead reconfiguration methods based on physical models struggle to remain robust and to avoid voltage limit violations. To address these problems, this paper develops a deep reinforcement learning method for sequential reconfiguration with soft open points (SOPs) based on real-time data. First, a state-based decision model is proposed by formulating the joint optimization of reconfiguration and SOP operation as a Markov decision process, so that decisions can be made in milliseconds. Then, a joint deep reinforcement learning framework comprising a branching double deep $Q$ network (BDDQN) and a multi-policy soft actor-critic (MPSAC) is proposed, which significantly improves the learning efficiency of the decision model in the multi-dimensional mixed-integer action space. Because control decisions are made from the real-time status of the DN, the influence of DG and load uncertainty on the control results is minimized. Numerical simulations on the IEEE 34-bus and 123-bus systems demonstrate that the proposed method effectively reduces the operation cost and resolves the overvoltage problem caused by a high penetration of photovoltaic (PV) generation.
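The abstract does not spell out the BDDQN update rule. As an illustration only, the sketch below assumes the commonly used branching Q-network target combined with double Q-learning: in each action branch the online network selects the greedy action, the target network evaluates it, and the evaluated values are averaged over branches. The function name, its arguments, and the idea that one branch corresponds to one discrete control dimension (for example, one switch) are hypothetical and not taken from the paper.

```python
from typing import Sequence


def branching_double_q_target(
    reward: float,
    gamma: float,
    q_online_next: Sequence[Sequence[float]],
    q_target_next: Sequence[Sequence[float]],
    done: bool = False,
) -> float:
    """Bootstrapped target for one transition (illustrative assumption).

    q_online_next[d][a] and q_target_next[d][a] hold the online and target
    network Q-values of action a in branch d, evaluated at the next state.
    """
    if done:
        # No bootstrapping from a terminal state.
        return reward

    evaluated = []
    for online_branch, target_branch in zip(q_online_next, q_target_next):
        # Double Q-learning: the online network selects the action ...
        best = max(range(len(online_branch)), key=online_branch.__getitem__)
        # ... and the target network evaluates the selected action.
        evaluated.append(target_branch[best])

    # Branching architecture: average the evaluated values over all branches.
    return reward + gamma * sum(evaluated) / len(evaluated)


if __name__ == "__main__":
    # Two branches with 2 and 3 discrete actions, respectively.
    y = branching_double_q_target(
        reward=1.0,
        gamma=0.5,
        q_online_next=[[1.0, 3.0], [0.5, 0.2, 0.1]],
        q_target_next=[[2.0, 4.0], [1.0, 9.0, 9.0]],
    )
    print(y)
```

Separating action selection (online network) from action evaluation (target network) is what distinguishes a double Q-learning target from a plain one; the per-branch structure keeps the number of network outputs linear, rather than exponential, in the number of discrete control dimensions.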