Consumer credit has grown significantly worldwide in recent years. Job scheduling in credit factories is essential for speeding up loan application processing and thereby improving factory efficiency. In this study, we propose a reinforcement learning approach to the credit factory scheduling problem, which is a stochastic flexible flow shop scheduling problem (SFFSP). First, we formulate a mathematical model of the credit factory SFFSP that abstracts the decision-making process as a semi-Markov process. Then, we design a reinforcement learning reward mechanism based on this model. Next, a self-attention neural network extracts state information from global and local multidimensional data, so that each decision takes the state of the entire process into account and aligns with the global goal. In addition, Monte Carlo Tree Search (MCTS) is utilised to enhance the training effect and sample utilisation of reinforcement learning. Finally, we conduct extensive experiments and demonstrate that our method achieves better performance on the SFFSP in credit factories than other approaches.
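To make the problem setting concrete, the following is a minimal sketch of a stochastic flexible flow shop viewed as an event-driven decision process. It is not the model or method of this study: the function names (`simulate`, `fifo_policy`), the exponential processing times, the stage layout, and the first-in-first-out rule standing in for a learned policy are all illustrative assumptions. What it is meant to show is the semi-Markov character of the problem: a decision is needed only when a machine is idle and a job is waiting, and the time between such decision epochs is random.

```python
import heapq
import random
from typing import Callable, Dict, List, Tuple

# A policy returns the index (within one stage's waiting queue) of the job to start.
Policy = Callable[[List[int], int, float], int]


def fifo_policy(queue: List[int], stage: int, now: float) -> int:
    """Hypothetical stand-in for a learned policy: start the longest-waiting job."""
    return 0


def simulate(
    num_jobs: int,
    machines_per_stage: List[int],
    mean_times: List[float],
    policy: Policy,
    seed: int = 0,
) -> Tuple[float, float, int]:
    """Run one episode; return (makespan, mean completion time, number of decisions)."""
    rng = random.Random(seed)
    num_stages = len(machines_per_stage)
    free = list(machines_per_stage)               # idle machines at each stage
    queues: List[List[int]] = [[] for _ in range(num_stages)]
    queues[0] = list(range(num_jobs))             # assumption: all jobs arrive at t = 0
    events: List[Tuple[float, int, int, int]] = []  # (finish time, tie-breaker, job, stage)
    finish_times: Dict[int, float] = {}
    counter = 0
    decisions = 0
    now = 0.0

    def dispatch(stage: int) -> None:
        """Decision epoch: assign waiting jobs while the stage has an idle machine."""
        nonlocal counter, decisions
        while free[stage] > 0 and queues[stage]:
            job = queues[stage].pop(policy(queues[stage], stage, now))
            free[stage] -= 1
            decisions += 1
            # Assumed exponential processing time with the given mean (the stochastic part).
            duration = rng.expovariate(1.0 / mean_times[stage])
            heapq.heappush(events, (now + duration, counter, job, stage))
            counter += 1

    dispatch(0)
    while events:
        # Jump to the next completion; the sojourn time between epochs is random.
        now, _, job, stage = heapq.heappop(events)
        free[stage] += 1
        if stage + 1 < num_stages:
            queues[stage + 1].append(job)
            dispatch(stage + 1)
        else:
            finish_times[job] = now
        dispatch(stage)

    makespan = max(finish_times.values())
    mean_completion = sum(finish_times.values()) / num_jobs
    return makespan, mean_completion, decisions


if __name__ == "__main__":
    # Three stages with 2, 1 and 2 parallel machines (illustrative values only).
    print(simulate(5, [2, 1, 2], [1.0, 0.5, 2.0], fifo_policy, seed=1))
```

In a reinforcement learning setting, `fifo_policy` is the component that a trained policy would replace; the state encoding, reward design, self-attention network, and MCTS described above are deliberately left out of this sketch.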