Jamming attacks on wireless communication networks occupy spectrum resources and cause data transmissions to fail. To address this problem, an anti-jamming algorithm based on distributed multi-agent reinforcement learning is proposed. Each terminal observes the spectrum state of the environment and uses it as input; the algorithm then combines Q-learning with primary and backup channel allocation rules to select the communication channel. These allocation rules are designed for sweep jamming and smart jamming strategies, allowing the algorithm to predict jammer behavior while reducing decision conflicts among terminals. Simulation results show that, compared with existing methods, the proposed algorithm achieves higher data transmission success rates across multiple scenarios and greater operational efficiency under jamming attacks. Overall, the proposed algorithm delivers better anti-jamming performance than the comparison methods.
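The channel-selection idea can be illustrated with a minimal sketch. It is not the proposed algorithm itself: the state encoding (the index of the currently jammed channel), the reward values, the rule "primary is the highest-valued channel, backup is the second-highest", the sequential sharing of already claimed channels, and all names (`Terminal`, `sweep_jammer`, `simulate`, `NUM_CHANNELS`) are assumptions made for illustration. Only a sweep jammer is modeled.

```python
import random

NUM_CHANNELS = 5  # hypothetical number of channels


def sweep_jammer(t):
    """Assumed sweep jammer: jams one channel per slot, in cyclic order."""
    return t % NUM_CHANNELS


class Terminal:
    """Tabular Q-learning agent; state = index of the channel jammed now."""

    def __init__(self, alpha=0.5, gamma=0.9, epsilon=0.1, rng=None):
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = rng or random.Random()
        # q[state][action]: value of picking channel `action` in `state`
        self.q = [[0.0] * NUM_CHANNELS for _ in range(NUM_CHANNELS)]

    def select(self, state, taken):
        """Rank channels, then apply an assumed primary/backup rule."""
        if self.rng.random() < self.epsilon:
            order = list(range(NUM_CHANNELS))  # explore: random ranking
            self.rng.shuffle(order)
        else:
            row = self.q[state]  # exploit: rank by Q-value, best first
            order = sorted(range(NUM_CHANNELS), key=lambda c: row[c], reverse=True)
        primary, backup = order[0], order[1]
        # Fall back to the backup channel if the primary is already claimed.
        return backup if primary in taken else primary

    def update(self, state, action, reward, next_state):
        """Standard one-step Q-learning update."""
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])


def simulate(steps=2000, num_terminals=2, seed=0):
    """Return the fraction of successful transmissions over all slots."""
    terminals = [Terminal(rng=random.Random(seed + i)) for i in range(num_terminals)]
    successes = 0
    for t in range(steps):
        state = sweep_jammer(t)            # channel observed as jammed now
        next_jammed = sweep_jammer(t + 1)  # channel jammed during transmission
        taken, choices = set(), []
        for term in terminals:             # assumption: terminals choose in turn
            channel = term.select(state, taken)
            taken.add(channel)
            choices.append(channel)
        for term, channel in zip(terminals, choices):
            ok = channel != next_jammed and choices.count(channel) == 1
            term.update(state, channel, 1.0 if ok else -1.0, next_jammed)
            successes += ok
    return successes / (steps * num_terminals)


if __name__ == "__main__":
    print(f"success rate: {simulate():.3f}")
```

Because the jammer in this sketch is deterministic, the Q-table effectively learns which channel will be jammed next, while the backup channel resolves cases where two terminals prefer the same primary channel. A smart jammer would need a richer state than the single jammed-channel index assumed here.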