Real-time traffic speed prediction is an essential component of intelligent transportation system applications on large-scale urban networks, e.g., proactive traffic management, advanced information provision, and prompt incident response. Traffic prediction models (e.g., convolutional neural networks) that operate on multi-detector speed diagrams in the time-space plane are among the most frequently used approaches for both individual roads and entire networks. However, the predefined stacking sequence of traffic detectors along the spatial dimension of the speed diagram has a significant influence on prediction performance, which makes network-wide speed prediction more challenging. To tackle this challenge and better capture complicated traffic dynamics, we propose a novel speed prediction approach, named spatial-temporal deep tensor neural networks (ST-DTNN), for a large-scale urban network with mixed road types. Spatial and temporal dependencies among different road segments are taken into account simultaneously to improve network-wide prediction accuracy. A scalable deep tensor is constructed for the ST-DTNN to eliminate the potentially negative impact of the manually specified stacking sequence of speed time series collected at different locations. Multi-step-ahead traffic speeds are predicted simultaneously from probe data for a real-world large-scale urban network with hundreds of detectors installed on freeways, highways, and major/minor arterials. The results demonstrate the capability and effectiveness of the proposed ST-DTNN approach. Compared with the benchmark models, the ST-DTNN achieves higher prediction accuracy during both peak and off-peak periods within an acceptable training time and shows more stable prediction performance across space. The proposed approach can be extended to support network-wide traffic state monitoring, routing optimization in navigation services, and congestion mitigation.
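The sensitivity to stacking order described above can be illustrated with a minimal sketch. It uses synthetic speeds and a hand-written 2D filter, not the ST-DTNN model or the study's data; the detector count, the permutation `order`, the kernel values, and the helper `conv2d_valid` are all hypothetical choices made for illustration. The sketch stacks the same detector time series in two different orders and shows that a convolution over the time-space diagram yields different feature values, even though the underlying measurements are identical.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Plain 2D cross-correlation with 'valid' padding (no framework needed)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

rng = np.random.default_rng(0)
n_detectors, n_steps = 6, 12  # hypothetical sizes

# Synthetic speed diagram: rows are detectors, columns are time steps.
speeds = rng.uniform(20.0, 100.0, size=(n_detectors, n_steps))

# A second, equally arbitrary stacking order of the same detectors.
order = np.array([3, 0, 5, 1, 4, 2])
reordered = speeds[order]

# A fixed 3x3 filter that mixes three vertically adjacent rows.
kernel = np.array([[1.0, 0.0, -1.0],
                   [2.0, 0.0, -2.0],
                   [1.0, 0.0, -1.0]])

features_a = conv2d_valid(speeds, kernel)
features_b = conv2d_valid(reordered, kernel)

# The two diagrams contain exactly the same measurements...
same_content = np.allclose(np.sort(speeds, axis=None),
                           np.sort(reordered, axis=None))
# ...but the filter sees different neighbouring detectors, so the
# extracted feature values differ, not just their row positions.
same_features = np.allclose(np.sort(features_a, axis=None),
                            np.sort(features_b, axis=None))

print("same measurements:", same_content)
print("same conv features:", same_features)
```

Because each filter window spans vertically adjacent rows, the choice of which detectors sit next to each other determines what the filter can learn. This is the dependence on a manual stacking sequence that the abstract identifies as the motivation for the tensor-based input.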