Adaptive scheduling is an efficient strategy for one-of-a-kind production (OKP), which is widespread in heavy industries, because it addresses the challenges of a high degree of customization and frequent interference. However, the scheduling procedure must cope with complex constraints and short decision times. To address these issues, this study develops a reinforcement learning-based algorithm for the adaptive scheduling problem of OKP, with the objective of minimizing the makespan. First, the OKP adaptive scheduling problem is modeled as a Markov decision process, and a reinforcement learning algorithm is used to train the scheduling agent offline. The trained agent can then make scheduling decisions adaptively, and in a short time, according to the production state. To evaluate the effectiveness of the proposed algorithm, extensive numerical experiments are performed on benchmark datasets and a practical engineering case. The results show that the proposed algorithm is competitive in static testing and that it achieves a balance between scheduling performance and computation time during adaptive scheduling.
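
To make the Markov decision process view concrete, the following is a minimal sketch of how a scheduling problem with a makespan objective can be cast as an MDP. It is not the formulation used in this study: the job-shop-style environment, the state tuple, the action set (choosing which job's next operation to schedule), and the reward (the negative increase in makespan, so that the episode return equals the negative final makespan) are all illustrative assumptions. The names `SchedulingEnv`, `rollout`, and `shortest_processing_time` are hypothetical, and a simple dispatching rule stands in for a trained agent.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Assumption: a job is an ordered list of (machine_id, processing_time) operations.
Job = List[Tuple[int, int]]


@dataclass
class SchedulingEnv:
    """Toy scheduling MDP (hypothetical, for illustration only)."""
    jobs: List[Job]
    n_machines: int
    next_op: List[int] = field(init=False)        # index of each job's next operation
    job_ready: List[int] = field(init=False)      # time at which each job is free
    machine_ready: List[int] = field(init=False)  # time at which each machine is free
    makespan: int = field(init=False)

    def __post_init__(self) -> None:
        self.reset()

    def reset(self):
        self.next_op = [0] * len(self.jobs)
        self.job_ready = [0] * len(self.jobs)
        self.machine_ready = [0] * self.n_machines
        self.makespan = 0
        return self.state()

    def state(self):
        # Assumed state: job progress plus job and machine availability times.
        return (tuple(self.next_op), tuple(self.job_ready), tuple(self.machine_ready))

    def legal_actions(self) -> List[int]:
        # An action is the index of a job that still has unscheduled operations.
        return [j for j, job in enumerate(self.jobs) if self.next_op[j] < len(job)]

    def step(self, j: int):
        machine, duration = self.jobs[j][self.next_op[j]]
        # The operation starts once both its job and its machine are free.
        start = max(self.job_ready[j], self.machine_ready[machine])
        end = start + duration
        self.job_ready[j] = end
        self.machine_ready[machine] = end
        self.next_op[j] += 1
        new_makespan = max(self.makespan, end)
        # Reward is the negative makespan increase, so rewards sum to -makespan.
        reward = self.makespan - new_makespan
        self.makespan = new_makespan
        done = not self.legal_actions()
        return self.state(), reward, done


def rollout(env: SchedulingEnv, policy: Callable[[SchedulingEnv], int]):
    """Run one episode and return (final makespan, sum of rewards)."""
    env.reset()
    total_reward, done = 0, False
    while not done:
        _, reward, done = env.step(policy(env))
        total_reward += reward
    return env.makespan, total_reward


def shortest_processing_time(env: SchedulingEnv) -> int:
    # Stand-in policy: pick the job whose next operation is shortest.
    return min(env.legal_actions(), key=lambda j: env.jobs[j][env.next_op[j]][1])


if __name__ == "__main__":
    toy_jobs = [[(0, 3), (1, 2)], [(1, 2), (0, 4)]]
    env = SchedulingEnv(jobs=toy_jobs, n_machines=2)
    print(rollout(env, shortest_processing_time))
```

In this sketch, an offline-trained agent would replace `shortest_processing_time` with a learned mapping from `env.state()` to an action. Because the policy is queried one decision at a time, it can react to the current production state rather than following a fixed plan.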