Traffic flow prediction is an integral part of an intelligent transportation system (ITS), supporting proactive planning and management in public transit networks. However, owing to the influence of passengers’ travel modes and the difficulty of data collection, some traffic flow datasets are both nonstationary in their temporal distribution and insufficient in size, which leads to inaccurate predictions. Therefore, a transfer learning-based method is proposed to predict nonstationary traffic flow with inadequate data in the public transit system. The proposed model introduces a novel approach, AdaRNN-DCORAL, to achieve accurate traffic flow prediction in the temporal dimension. Specifically, AdaRNN is an adaptive prediction network for nonstationary traffic flow data, while DCORAL performs domain adaptation between the source and target domains in the public transit system. First, two traffic flow domains are established from two public transit modes: a target domain with insufficient data and a source domain with sufficient data. Subsequently, the traffic flow datasets are divided into weekday and holiday time series according to their distribution characteristics. AdaRNN-DCORAL is then trained on the constructed datasets, enabling accurate prediction of nonstationary traffic flow in the data-scarce target domain. Finally, experiments demonstrate that the proposed method outperforms state-of-the-art models in terms of prediction accuracy and computation time.
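
To illustrate the domain-adaptation idea behind DCORAL, the following is a minimal sketch of the standard CORAL loss, which penalizes the gap between the feature covariances of the source and target domains. It is not the authors' implementation: the function name `coral_loss`, the use of NumPy, and the assumption that each batch is an array of shape (n_samples, d) with at least two samples are all illustrative choices.

```python
import numpy as np


def coral_loss(source_feats, target_feats):
    """Standard CORAL loss between two feature batches of shape (n_samples, d).

    Hypothetical helper for illustration only; not taken from the paper.
    """
    source_feats = np.asarray(source_feats, dtype=float)
    target_feats = np.asarray(target_feats, dtype=float)
    d = source_feats.shape[1]
    # Sample covariance of each batch (rows are samples, columns are features).
    # reshape keeps the result a d x d matrix even when d == 1.
    cov_s = np.cov(source_feats, rowvar=False).reshape(d, d)
    cov_t = np.cov(target_feats, rowvar=False).reshape(d, d)
    # Squared Frobenius norm of the covariance gap, scaled by 1 / (4 d^2).
    return float(np.sum((cov_s - cov_t) ** 2) / (4.0 * d * d))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    source = rng.normal(size=(64, 8))        # stand-in for source-domain features
    target = 2.0 * rng.normal(size=(32, 8))  # stand-in for target-domain features
    print(coral_loss(source, target))
```

In a transfer learning setup of the kind described, a term like this would typically be added to the prediction loss with a weighting factor, so that features learned from the data-rich source domain remain usable in the data-scarce target domain.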