As personalized demand grows and social productivity continues to improve, the centralized production model of large batches and few varieties is gradually giving way to a personalized model of small batches and many varieties, which makes the manufacturing process in the job shop increasingly complex. Furthermore, disruptive events such as machine failures and rush orders increase the uncertainty and variability of the production environment. Traditional scheduling methods are usually based on fixed rules and heuristic algorithms, which adapt poorly to constantly changing production environments and demands. This can lead to inaccurate scheduling decisions and hinder the optimal allocation of job shop resources. To solve the dynamic job shop scheduling problem (JSP) more effectively, this paper proposes a reinforcement learning (RL) optimization algorithm that integrates a long short-term memory (LSTM) neural network with proximal policy optimization (PPO). The algorithm dynamically adjusts its scheduling strategy as the production environment changes, using comprehensive awareness of the job shop state to make optimal scheduling decisions. First, a state-aware network framework based on LSTM-PPO is proposed to perceive job shop state changes in real time. Then, the state and action spaces of the job shop are defined within this framework. Finally, an experimental environment is established to verify the algorithm’s effectiveness. After training, the LSTM-PPO algorithm achieves better performance than the other scheduling methods considered. The efficiency of the proposed algorithm on the dynamic JSP is verified by comparing the initial planning time with the actual completion time of the rescheduling decision under different dynamic disturbances.
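For readers unfamiliar with PPO, the following minimal sketch illustrates the standard clipped surrogate objective on which PPO is based. It is not the implementation used in this paper: the function name `ppo_clipped_objective`, its argument names, and the default clipping range of 0.2 are assumptions made for illustration. In an LSTM-PPO setting, the probability ratios would come from a policy network that encodes the job shop state with an LSTM.

```python
from typing import Sequence


def ppo_clipped_objective(
    ratios: Sequence[float],
    advantages: Sequence[float],
    clip_eps: float = 0.2,  # assumed clipping range, a common default
) -> float:
    """Mean PPO clipped surrogate objective (a quantity to be maximized).

    ratios[i]     = pi_new(a_i | s_i) / pi_old(a_i | s_i)
    advantages[i] = advantage estimate for the same (s_i, a_i) pair
    """
    if len(ratios) != len(advantages):
        raise ValueError("ratios and advantages must have the same length")
    if not ratios:
        raise ValueError("at least one sample is required")

    total = 0.0
    for ratio, advantage in zip(ratios, advantages):
        # Restrict the ratio to [1 - eps, 1 + eps].
        clipped_ratio = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Take the pessimistic (smaller) of the two surrogate terms.
        total += min(ratio * advantage, clipped_ratio * advantage)
    return total / len(ratios)
```

Taking the minimum of the clipped and unclipped terms removes any incentive to move the new policy far from the old one within a single update, which is the usual motivation for PPO in environments whose state changes abruptly.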