Graph algorithm is pervasive in many applications ranging from targeted advertising to natural language processing. Recently, Asynchronous Graph Processing (AGP) is becoming a promising model to support graph algorithm on large-scale distributed computing platforms because it enables faster convergence speed and lower synchronization cost than the synchronous model for no barrier between iterations. However, existing AGP methods still suffer from poor performance for inefficient vertex state propagation. In this paper, we propose an effective and low-cost forward and backward sweeping execution method to accelerate state propagation for AGP, based on a key observation that states in AGP can be propagated between vertices much faster when the vertices are processed sequentially along the graph path within each round. Through dividing graph into paths and asynchronously processing vertices on each path in an alternative forward and backward way according to their order on this path, vertex states in our approach can be quickly propagated to other vertices and converge in a faster way with only little additional overhead. In order to efficiently support it over distributed platforms, we also propose a scheme to reduce the communication overhead along with a static priority ordering scheme to further improve the convergence speed. Experimental results on a cluster with 1,024 cores show that our approach achieves excellent scalability for large-scale graph algorithms and the overall execution time is reduced by at least 39.8 percent, in comparison with the most cutting-edge methods.