This paper studies the convergence of the Nash-Q-Learning algorithm. Previous convergence proofs require that every stage game have a global optimal point or a saddle point. This assumption is so strict that it leaves few application scenarios for the algorithm. Yet the algorithm also converges in two Grid-World Games that do not satisfy these assumptions. Previous researchers therefore suggested that the assumptions might be appropriately relaxed, but gave no rigorous theoretical proof. Viewed through the theory of fractals, the convergence point is a fractal attractor; on this basis, a general mathematical proof of the convergence of the Nash-Q-Learning algorithm is given. The efficiency and scalability of the algorithm are also discussed in detail.