Accurate hydrological prediction is often hindered by the lack of stream gauges in data-scarce regions, where conventional transfer learning (TL) models built on Long Short-Term Memory (LSTM) networks tend to lose accuracy and adapt poorly. To improve runoff prediction in such regions, we developed DAformer, a novel TL approach that integrates domain adversarial neural networks with the Informer model. Trained on comprehensive runoff data from U.S. basins, DAformer was applied to three basins in Chile and the Chaersen basin in China, demonstrating effective transfer from data-rich to data-scarce environments. Results show that DAformer significantly outperforms LSTM-based models, improving forecast accuracy by 16.1% at a 1-day lead time and by 100.5% at a 5-day lead time. These gains indicate that DAformer has substantial practical value for flood risk management and water resource planning where data are limited. By clustering basins using Shuttle Radar Topography Mission (SRTM) and other geographical data, we also found that drawing on multiple source basins further improves performance. DAformer therefore offers a robust and scalable method for runoff prediction in data-scarce regions.