Sample efficiency is a limiting factor for existing distributed multi-agent reinforcement learning (MARL) algorithms over networked multi-agent systems. In this paper, the sample efficiency problem is tackled by formally incorporating entropy regularization into the distributed MARL algorithm design. First, a new entropy-regularized MARL problem is formulated under the model of networked multi-agent Markov decision processes with observation-based policies and homogeneous agents, where sharing policy parameters among the agents provably preserves optimality. Second, an on-policy distributed actor–critic algorithm is proposed, in which each agent shares the parameters of both its critic and its actor for consensus updates. Third, a convergence analysis of the proposed algorithm is provided based on stochastic approximation theory, under the assumption of linear function approximation for the critic. Furthermore, a practical off-policy version of the proposed algorithm is developed, which offers scalability, data efficiency, and learning stability. Finally, the proposed distributed algorithm is compared against strong baselines, including two classic centralized training algorithms, in the multi-agent particle environment, and its learning performance is empirically demonstrated through extensive simulation experiments.
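To make the two ingredients named above concrete, the following is a minimal sketch of a generic parameter consensus step and a generic entropy bonus. It is illustrative only and is not the update rule of the proposed algorithm, which is not specified in this abstract. The function names, the Metropolis weighting scheme, the ring communication graph implied by the usage note, and the temperature `alpha` are all assumptions introduced here for illustration.

```python
import numpy as np


def metropolis_weights(adjacency):
    """Build a symmetric, doubly stochastic mixing matrix from an
    undirected communication graph (assumed weighting scheme)."""
    adjacency = np.asarray(adjacency, dtype=float)
    n = adjacency.shape[0]
    degree = adjacency.sum(axis=1)
    weights = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j and adjacency[i, j] > 0:
                weights[i, j] = 1.0 / (1.0 + max(degree[i], degree[j]))
        # The diagonal is still zero here, so this makes row i sum to one.
        weights[i, i] = 1.0 - weights[i].sum()
    return weights


def consensus_step(params, weights):
    """Replace each agent's parameter vector (one row of `params`) with a
    weighted average of its own and its neighbors' vectors."""
    return weights @ params


def softmax_entropy(logits):
    """Shannon entropy of the softmax policy defined by `logits`."""
    shifted = logits - np.max(logits)
    log_probs = shifted - np.log(np.sum(np.exp(shifted)))
    return -np.sum(np.exp(log_probs) * log_probs)


def entropy_regularized_reward(reward, logits, alpha):
    """Reward plus an entropy bonus scaled by a hypothetical temperature
    `alpha`; larger `alpha` favors more exploratory policies."""
    return reward + alpha * softmax_entropy(logits)
```

Under these assumptions, applying `consensus_step` with a doubly stochastic matrix leaves the network-wide average of the parameters unchanged while shrinking the disagreement between agents, which is the property that consensus-based actor and critic updates generally rely on.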