Accurate traffic flow prediction is a crucial upstream technique in intelligent transportation systems, supporting travel planning, efficient urban transport, and regulation by transport departments. Mainstream spatiotemporal graph convolutional neural networks usually rely on prior knowledge to predefine an adjacency matrix that encodes the spatial dependencies of the road network. However, because the spatial correlations between road segments change over time, modeling them statically limits how accurately these models can predict traffic flow. To address this issue, we propose a spatiotemporal gated transformer network with a graph latent information learning structure, termed GL-STGTN, for spatiotemporal traffic flow forecasting. First, we propose a graph latent information learning structure that dynamically learns the spatial dependencies of road network conditions from both global and local perspectives. Second, we design a spatiotemporal gated transformer network (STGTN) block, which consists of a localized geographically aware block that extracts local embeddings of spatial correlations and a temporal-aware enlarged block that extracts local temporal dependencies. The learned spatial and temporal feature embeddings are then aggregated by a spatial multi-head attention module and a temporal multi-head attention module, respectively. Finally, a spatiotemporal fusion layer fuses the spatial and temporal information from the stacked STGTN blocks. Experiments on two public real-world benchmark datasets show that our model outperforms six state-of-the-art models in multi-step traffic flow forecasting.
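To make the contrast between a predefined and a learned adjacency matrix concrete, the following is a minimal sketch of one common way to derive spatial dependencies from trainable node embeddings. It is not the graph latent information learning structure of GL-STGTN, whose details are not given here. The form A = softmax(ReLU(E1 E2^T)), the function names `learned_adjacency` and `random_embeddings`, and the embedding sizes are all assumptions chosen for illustration; in practice the embeddings would be trained jointly with the forecasting model rather than sampled at random.

```python
import math
import random


def learned_adjacency(src_emb, dst_emb):
    """Build a row-normalized adjacency matrix from two node-embedding tables.

    Hypothetical illustration: A = softmax(relu(src_emb @ dst_emb^T)), row-wise.
    Each row i holds the weights with which node i attends to every node j.
    """
    adjacency = []
    for src in src_emb:
        # Dot-product similarity to every node, clipped at zero (ReLU).
        scores = [max(0.0, sum(s * d for s, d in zip(src, dst))) for dst in dst_emb]
        # Numerically stable softmax over the row.
        peak = max(scores)
        exps = [math.exp(x - peak) for x in scores]
        total = sum(exps)
        adjacency.append([e / total for e in exps])
    return adjacency


def random_embeddings(num_nodes, dim, seed):
    """Stand-in for trainable embeddings: Gaussian samples with a fixed seed."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_nodes)]


if __name__ == "__main__":
    # Four road segments, eight-dimensional embeddings (both sizes are arbitrary).
    src = random_embeddings(4, 8, seed=0)
    dst = random_embeddings(4, 8, seed=1)
    for row in learned_adjacency(src, dst):
        print([round(w, 3) for w in row])
```

Because the matrix is a function of the embeddings rather than of fixed road distances, it can change as the embeddings are updated, which is the property a static, predefined adjacency matrix lacks.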