Air traffic flow managers are continually faced with the decision of when and how to respond to predictions of future constraints. The promise of artificial intelligence, and specifically reinforcement learning, to provide decision support in this domain stems from its ability to systematically evaluate a sequence of potential actions, or strategy, across a range of uncertain futures. To serve as decision support for human traffic managers, the generated recommendations must embody the characteristics of a good management strategy, which requires introducing those characteristics to the algorithm. This paper proposes inducing stability into the strategy by dynamically constraining the design space based on upstream design decisions to promote consistency in the recommendations over time; two such constraint sets are considered. The paper further evaluates the impact of adding a performance improvement threshold that must be exceeded before a new strategy recommendation is accepted. The combination of search constraints and threshold values is evaluated against the agent's reward function as well as measures proposed to capture the stability of the strategy. The results show that the more restrictive set of constraints yields the best performance in terms of strategy stability and is more likely to reduce delay, while implementation of the threshold has only a minor impact on overall performance. However, for the highest-impact day of 8 June 2018, applying the threshold reverses the performance gains in delay but dramatically improves the stability of the resulting traffic flow management strategy from a flight-level perspective, implying a potential tradeoff between delay optimization and flight predictability.
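As an illustration of the acceptance rule described above, the following is a minimal sketch of how a performance improvement threshold could gate the adoption of a new strategy recommendation: a candidate strategy replaces the incumbent only if its predicted reward improves on the incumbent's by more than the threshold. The names (Strategy, accept_candidate) and the scalar-reward formulation are assumptions made for illustration, not the paper's implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Strategy:
    """Hypothetical container for a traffic flow management strategy."""
    actions: tuple            # sequence of flow-management actions
    predicted_reward: float   # agent's estimate of this strategy's reward


def accept_candidate(incumbent: Optional[Strategy],
                     candidate: Strategy,
                     threshold: float) -> Strategy:
    """Keep the incumbent unless the candidate's predicted reward exceeds it
    by more than `threshold` (illustrative acceptance rule only)."""
    if incumbent is None:
        return candidate
    if candidate.predicted_reward - incumbent.predicted_reward > threshold:
        return candidate
    return incumbent


# Example: a modest predicted improvement below the threshold keeps the
# incumbent strategy, promoting consistency of recommendations over time.
current = Strategy(actions=("ground_stop",), predicted_reward=-120.0)
proposal = Strategy(actions=("ground_delay",), predicted_reward=-115.0)
print(accept_candidate(current, proposal, threshold=10.0) is current)  # True
```

Under this kind of rule, larger threshold values trade potential delay reductions for fewer strategy revisions, consistent with the stability-versus-delay tradeoff the abstract reports.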