The formation of congestion on an urban road network is a key issue for the development of sustainable mobility in future smart cities. In this work, we take a reductionist approach and study the stationary states of a simple transport model formulated as a random process on a graph, where each node represents a location and the link weights give the transition rates for moving from one node to another, thereby representing the mobility demand. Each node has a maximum flow rate and a maximum load capacity, and we assume that the average incoming flow equals the average outgoing flow. In the single-step process approximation, we analytically characterize the traffic load distribution at individual nodes using a local maximum entropy principle. Our results explain how congested nodes emerge as the total traffic load increases, in analogy with a percolation transition in which the appearance of each congested node is an independent random event. However, numerical simulations show that in the more realistic case of synchronous node dynamics, entropic forces introduce correlations among the node states and favor the clustering of empty and congested nodes. Our aim is to highlight the universal properties of congestion formation and, in particular, to understand the role of traffic load fluctuations as a possible precursor of congestion in a transport network.
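To make the setting concrete, the following is a minimal sketch of one possible single-step dynamics on a weighted graph. It is an illustration under stated assumptions, not the model as defined in the paper. The function name `simulate_single_step`, the single shared `capacity`, and the rule that a move is allowed only from a non-empty node to a node below capacity are all hypothetical choices. The outgoing rate of a non-empty node is taken to be independent of its load, which serves only as a crude stand-in for a maximum flow rate.

```python
import random


def simulate_single_step(weights, loads, capacity, steps, seed=0):
    """Move one particle per step on a weighted directed graph.

    weights[i][j] is the assumed transition rate from node i to node j,
    loads is the initial number of particles on each node, and capacity is
    an assumed common maximum load per node.
    """
    rng = random.Random(seed)
    n = len(loads)
    loads = list(loads)  # work on a copy so the caller's list is untouched
    for _ in range(steps):
        # Collect the allowed moves: the source must hold at least one
        # particle and the destination must be below capacity.
        moves, rates = [], []
        for i in range(n):
            if loads[i] == 0:
                continue
            for j in range(n):
                if i != j and weights[i][j] > 0 and loads[j] < capacity:
                    moves.append((i, j))
                    rates.append(weights[i][j])
        if not moves:
            break  # fully blocked configuration, nothing can move
        # Pick one move with probability proportional to its rate.
        i, j = rng.choices(moves, weights=rates, k=1)[0]
        loads[i] -= 1
        loads[j] += 1
    return loads


# Hypothetical example: a directed ring of four nodes with unit rates.
ring = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0]]
final_loads = simulate_single_step(ring, [3, 0, 0, 1], capacity=3, steps=200)
```

In this sketch the total number of particles is conserved and no node exceeds the assumed capacity. Collecting `final_loads` over many seeds would give an empirical load distribution per node, which is the kind of stationary quantity the analytical treatment addresses.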