The Gutenberg–Richter (G–R) relationship can be derived as the Gibbs distribution. For a given earthquake set (all earthquakes in a given region, time period, magnitude range, tectonic settings) the Gibbs probability density function for magnitudes, with a given b value in its exponent, is the most uniform distribution under the constraints of the magnitude range and mean value. Therefore, it represents our limited knowledge about the system output: the only pieces of information are the mean value and the magnitude range. Honest earthquake forecasts can be based on such a distribution, since it represents all and only available information about the seismic system. The b value can change among different earthquake sets (in time, space, magnitude ranges, or tectonic settings), since it is related to earthquake rupture dynamics, or seismic source characteristics. The relationship between the b value and the exponent $$\beta$$ in the rupture area vs. maximum slip scaling, $$A\propto D^{\beta }$$, results from viewing earthquake recurrence time in connection with the slip budget. This makes a link between earthquake statistics (the G–R law) and physics (fault characteristics). Specifically, the relationship enables us to explain different ranges of b values at megathrust faults, in dependence on interplate asperity and coupling distributions, as well as on amounts of sediments and fluids in subduction channels. The approach differs from common interpretations of the G–R law in that the b value becomes a field variable, not a constant. It is always the Gibbs distribution for a given magnitude range that we use due to our ignorance about the system outcome, and it is the b value that variates, depending on our knowledge about the system physics. This is important for seismic forecasts, which are mostly based on the G–R relationship. First, because the physical processes leading to the largest earthquakes can be revealed by observing the b value variations. Second, because earthquake generation process can be thought of as sampling with constraints, where the b value and the magnitude range are the constraints.