In this paper, we present a novel data-driven optimization approach for trajectory-based air traffic flow management (ATFM). A key aspect of the proposed approach is the inclusion of airspace users’ trajectory preferences, which are computed from traffic data by combining clustering and classification techniques. Machine learning is also used to extract consistent trajectory options, whereas optimization is applied to resolve demand-capacity imbalances by means of a mathematical programming model that assigns a feasible four-dimensional trajectory and a possible ground delay to each flight. The methodology has been tested on instances extracted from the Eurocontrol data repository. With more than 32,000 flights considered, we solve the largest instances of the ATFM problem available in the literature in computational times short enough for practical use. As a by-product, we highlight the trade-off between preferences and delays, as well as the potential benefits of the approach. Indeed, computing efficient solutions to the problem facilitates consensus between the network manager and airspace users. In view of the accuracy of the solutions and the excellent computational performance, we are optimistic that the proposed approach can make a significant contribution to the development of the next generation of air traffic flow management tools.