Abstract The rising demand for air traffic will inevitably result in a surge in both the number and complexity of flight conflicts, necessitating intelligent strategies for conflict resolution. This study addresses the critical challenges of scalability and real-time performance in multi-aircraft flight conflict resolution by proposing a comprehensive method that integrates a priority ranking mechanism with a conflict resolution model based on the Markov decision process (MDP). Within this framework, the proximity between aircraft in a multi-aircraft conflict set is dynamically assessed to establish a conflict resolution ranking mechanism. The problem of multi-aircraft conflict resolution is formalised through the MDP, encompassing the design of state space, discrete action space and reward function, with the transition function implemented via simulation prediction using model-free methods. To address the positional uncertainty of aircraft in real-time scenarios, the conflict detection mechanism introduces the aircraft’s positional error. A deep reinforcement learning (DRL) environment is constructed incorporating actual airspace structures and traffic densities, leveraging the Actor Critic using Kronecker-factored Trust Region (ACKTR) algorithm to determine resolution actions. The experimental results indicate that with 20–30 aircraft in the airspace, the success rate can reach 94% for the training set and 85% for the test set. Furthermore, this study analyses the impact of varying aircraft numbers on the success rate within a specific airspace scenario. The outcomes of this research provide valuable insights for the automation of flight conflict resolution.