In pursuit of capacity-oriented urban sustainable development (SD), this study aims to make the multimodal transport network more accessible, efficient, and environmentally friendly. A sustainable framework combining policy mixes and second-best constraints is developed as a multimodal transportation network capacity (TNC) model with a bi-level programming (BLP) formulation. The upper-level model maximises the total origin-destination (OD) demand subject to second-best constraints. The lower-level model is a combined mode split and traffic assignment with elastic demand (CMSTA-ED) model that accounts for policy mixes. Numerical experiments demonstrate the effectiveness of the hybrid particle swarm optimisation and grey wolf optimiser (PSO–GWO) algorithm, the synergy achieved by selecting and mixing policy instruments according to the proposed assessment system, and the methodology for establishing a policy mix. The findings offer practical guidance for formulating well-informed, non-redundant policy mixes that improve the performance of sustainable multimodal transportation networks.