Scalable methods for optical transmission performance prediction using machine learning (ML) are studied in metro reconfigurable optical add-drop multiplexer (ROADM) networks. A cascaded learning framework is introduced to encompass the use of cascaded component models for end-to-end (E2E) optical path prediction augmented with different combinations of E2E performance data and models. Additional E2E optical path data and models are used to reduce the prediction error accumulation in the cascade. Off-line training (pre-trained prior to deployment) and transfer learning are used for component-level erbium-doped fiber amplifier (EDFA) gain models to ensure scalability. Considering channel power prediction, we show that the data collection process of the pre-trained EDFA model can be reduced to only 5% of the original training set using transfer learning. We evaluate the proposed method under three different topologies with field deployed fibers and achieve a mean absolute error of 0.16 dB with a single (one-shot) E2E measurement on the deployed 6-span system with 12 EDFAs.