Efficient planning of airport capacity is key to successful traffic flow management. Yet the dynamic and uncertain behavior of capacity-determining factors makes it difficult to estimate flow rates precisely, especially for strategic planning horizons. Metroplex systems impose additional challenges on this decision-making process because of operational interdependencies between closely located airports. This paper presents a data-driven framework to identify, characterize, and predict traffic flow patterns in the terminal area of multi-airport systems, with the goal of improving capacity planning decision support in complex airspace. By identifying and characterizing patterns in terminal area traffic flows, we learn recurrent utilization patterns of runways and airspace, as well as the relevant decision factors, and use that knowledge to develop descriptive models for metroplex configuration prediction and capacity estimation. The framework applies machine learning methods to historical flight tracks, weather forecasts, and airport operational data. A multi-layer clustering analysis is first performed to mine spatial and temporal trends in flight trajectory data and identify traffic flow patterns. Based on this knowledge, a multi-way classification model is developed to generate probabilistic forecasts of the metroplex traffic flow structure for look-ahead times of up to eight hours. Finally, an empirical approach to arrival capacity estimation is proposed based on historical flow pattern behavior. The observed variability in throughput and terminal area delay performance emphasizes the importance of metroplex configuration predictability for improved flow rate planning and, ultimately, better traffic regulation.
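As a purely illustrative aid, the idea of estimating arrival capacity empirically from historical flow pattern behavior can be sketched as follows. This is a minimal sketch under stated assumptions, not the method used in the paper: it assumes that each historical hour has already been labeled with a traffic flow pattern, and it takes a high percentile of the observed arrival throughput under each pattern as that pattern's capacity estimate. The function name `empirical_capacity`, the choice of the 95th percentile, and the sample data are all hypothetical.

```python
from collections import defaultdict
from statistics import quantiles


def empirical_capacity(observations, percentile=95):
    """Estimate arrival capacity per flow pattern from historical throughput.

    observations: iterable of (pattern_label, arrivals_per_hour) pairs.
    percentile:   integer in 1..99; a high percentile is ASSUMED here as a
                  stand-in for capacity (the abstract does not specify one).
    """
    # Group the observed hourly arrival counts by flow pattern.
    by_pattern = defaultdict(list)
    for pattern, rate in observations:
        by_pattern[pattern].append(rate)

    capacity = {}
    for pattern, rates in by_pattern.items():
        if len(rates) < 2:
            # quantiles() needs at least two points; fall back to the
            # single observation.
            capacity[pattern] = float(rates[0])
            continue
        # 99 cut points; index percentile - 1 is the requested percentile.
        cuts = quantiles(rates, n=100, method="inclusive")
        capacity[pattern] = cuts[percentile - 1]
    return capacity


if __name__ == "__main__":
    # Hypothetical hourly arrival counts under two made-up flow patterns.
    history = [
        ("north_flow", 30), ("north_flow", 32), ("north_flow", 34),
        ("north_flow", 36), ("north_flow", 38),
        ("south_flow", 20), ("south_flow", 22),
    ]
    for pattern, cap in empirical_capacity(history).items():
        print(f"{pattern}: {cap:.1f} arrivals/hour")
```

Using a percentile rather than the maximum makes the estimate less sensitive to isolated outlier hours; comparing the estimates across patterns gives a rough view of how throughput varies with the metroplex configuration.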