This paper investigates the use of deep multi-agent reinforcement learning (MARL) to coordinate residential energy flexibility. In particular, we focus on achieving cooperation between homes in a way that is fully privacy-preserving, scalable, and able to manage distribution network voltage constraints. Previous work demonstrated that MARL-based distributed control can be achieved with no sharing of personal data during execution. However, previous cooperative MARL approaches impose an ever-greater computational burden during training as the system grows, limiting scalability, and they do not manage their impact on distribution network constraints. We therefore adopt a deep multi-agent actor–critic method with a centralised but factored critic to rehearse coordination ahead of execution, so that homes can cooperate successfully at scale, with only first-order growth in computational time as the system size increases. For 30 homes, training times are thus 34 times shorter than with a previous state-of-the-art reinforcement learning approach without a factored critic. Moreover, experiments show that agent cooperation reduces the likelihood of under-voltages by 47.2%. The results indicate significant potential value for the management of energy users' bills, battery depreciation, and distribution network voltages, with minimal information and communication infrastructure requirements, no interference with daily activities, and no sharing of personal data.
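To illustrate why a factored critic yields only first-order growth in computation, the sketch below composes a centralised joint value as an additive combination of per-agent utility networks, so that evaluating the joint action requires one small forward pass per home rather than a pass over the exponentially large joint state-action space. This is a minimal sketch of the general factored-critic idea, not the paper's exact architecture; all class names, dimensions, and the additive combination are illustrative assumptions.

```python
import torch
import torch.nn as nn


class PerAgentUtility(nn.Module):
    """Illustrative per-agent utility network Q_i(o_i, a_i)."""

    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs: torch.Tensor, act: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([obs, act], dim=-1))


class FactoredCentralCritic(nn.Module):
    """Centralised critic factored across agents (assumed additive form).

    The joint value is a sum of per-agent utilities, so the cost of
    evaluating a joint action grows linearly with the number of homes.
    """

    def __init__(self, n_agents: int, obs_dim: int, act_dim: int):
        super().__init__()
        self.utilities = nn.ModuleList(
            PerAgentUtility(obs_dim, act_dim) for _ in range(n_agents)
        )

    def forward(self, obs: torch.Tensor, act: torch.Tensor) -> torch.Tensor:
        # obs, act: [batch, n_agents, dim]; one forward pass per agent.
        q_i = [u(obs[:, i], act[:, i]) for i, u in enumerate(self.utilities)]
        # Joint value Q(o, a): [batch, 1].
        return torch.stack(q_i, dim=1).sum(dim=1)


# Hypothetical usage: 30 homes, each with a 10-dim observation and a
# 1-dim flexibility action (dimensions are assumptions for illustration).
critic = FactoredCentralCritic(n_agents=30, obs_dim=10, act_dim=1)
q_joint = critic(torch.randn(8, 30, 10), torch.randn(8, 30, 1))
```

In practice, factored-critic methods may replace the plain sum with a learned mixing function over the per-agent utilities; the linear-in-agents scaling argument is the same, since each home still contributes one small network evaluation.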