The scaling properties of earthquake populations show remarkable similarities to those observed at or near the critical point of other composite systems in statistical physics. This has led to the development of a variety of physical models of seismogenesis as a critical phenomenon, involving locally nonlinear dynamics, with simplified rheologies exhibiting instability or avalanche‐type behavior, in a material composed of a large number of discrete elements. In particular, it has been suggested that earthquakes are an example of a “self‐organized critical phenomenon” analogous to a sandpile that spontaneously evolves to a critical angle of repose in response to the steady supply of new grains at the summit. In this stationary state of marginal stability the distribution of avalanche energies is a power law, equivalent to the Gutenberg‐Richter frequency‐magnitude law, and the behavior is relatively insensitive to the details of the dynamics. Here we review the results of some of the composite physical models that have been developed to simulate seismogenesis on different scales during (1) dynamic slip on a preexisting fault, (2) fault growth, and (3) fault nucleation. The individual physical models share some generic features, such as a dynamic energy flux applied by tectonic loading at a constant strain rate, strong local interactions, and fluctuations generated either dynamically or by fixed material heterogeneity, but they differ significantly in the details of the assumed dynamics and in the methods of numerical solution. However, all exhibit critical or near‐critical behavior that is quantitatively consistent with many of the observed fractal or multifractal scaling laws of brittle faulting and earthquakes, including the Gutenberg‐Richter law. Some of the results are sensitive to the details of the dynamics and hence are not strict examples of self‐organized criticality.
Nevertheless, the results of these different physical models share some generic statistical properties similar to the “universal” behavior seen in a wide variety of critical phenomena, with significant implications for practical problems in probabilistic seismic hazard evaluation. In particular, the notion of self‐organized criticality (or near‐criticality) gives a scientific rationale for the a priori assumption of “stationarity” used as a first step in the prediction of the future level of hazard. The Gutenberg‐Richter law (a power law in energy or seismic moment) is found to apply only within a finite scale range, both in model and natural seismicity. Accordingly, the frequency‐magnitude distribution can be generalized to a gamma distribution in energy or seismic moment (a power law with an exponential tail). This allows extrapolations of the frequency‐magnitude distribution and the maximum credible magnitude to be constrained by observed seismic or tectonic moment release rates. The answers to other questions are less clear: for example, the effect of the a priori assumption of a Poisson process in a system with strong local interactions, and the impact of zoning a potentially multifractal distribution of epicenters with smooth polygons. The results of some models show premonitory patterns of seismicity which could in principle be used as mainshock precursors. However, there remains no consensus, on either theoretical or practical grounds, on whether reliable intermediate‐term earthquake prediction is possible.
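As an illustration of how a moment release rate can constrain the tail of a gamma‐type frequency‐moment distribution, the following minimal sketch assumes a density of the form n(M) = A M^-(1+B) exp(-M/Mc), where M is seismic moment, B is the power‐law exponent (0 < B < 1), and Mc is a corner moment. This parameterization, the function names, and the use of the standard moment‐magnitude conversion (moment in N·m) are assumptions made for the example and are not taken from the text above. Under these assumptions the total moment rate integrates to A Mc^(1-B) Γ(1-B), which can be inverted for Mc.

```python
import math


def amplitude_from_rate(rate_above, m_threshold, b_exp):
    """Amplitude A of the assumed density n(M) = A * M**-(1+B) * exp(-M/Mc).

    Uses the event rate above m_threshold and assumes m_threshold << Mc,
    so the exponential tail is negligible there and N(>M) ~ A * M**-B / B.
    """
    return rate_above * b_exp * m_threshold ** b_exp


def moment_rate(amplitude, corner_moment, b_exp):
    """Integral of M * n(M) from 0 to infinity (finite only for 0 < B < 1)."""
    if not 0.0 < b_exp < 1.0:
        raise ValueError("B must lie strictly between 0 and 1")
    return amplitude * corner_moment ** (1.0 - b_exp) * math.gamma(1.0 - b_exp)


def corner_moment_from_moment_rate(total_moment_rate, amplitude, b_exp):
    """Invert moment_rate() for the corner moment Mc."""
    if not 0.0 < b_exp < 1.0:
        raise ValueError("B must lie strictly between 0 and 1")
    scaled = total_moment_rate / (amplitude * math.gamma(1.0 - b_exp))
    return scaled ** (1.0 / (1.0 - b_exp))


def moment_to_magnitude(moment_nm):
    """Standard moment magnitude from seismic moment in N*m (an assumption here)."""
    return (2.0 / 3.0) * (math.log10(moment_nm) - 9.1)
```

In this sketch, A would be fixed from the catalogued event rate in the power‐law range, and Mc then follows from a seismic or tectonic moment rate estimate. With B = 2/3 the corner moment scales as the cube of the moment rate, so uncertainty in the rate estimate is strongly amplified in the inferred tail. The corner moment is a soft cutoff and need not coincide with a maximum credible magnitude.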