Abstract Understanding ENSO dynamics has tremendously improved over past decades. The ENSO diversity in spatial pattern, peak intensity, and temporal evolution is however still poorly represented in conceptual ENSO models. In this paper, a physics-informed auto-learning framework is applied to derive ENSO stochastic conceptual models with varying degrees of freedom. The framework is computationally efficient and easy to apply. Once the state vector of the target model is set, causal inference is exploited to build the right-hand side of the equations based on a mathematical function library. Fundamentally different from standard nonlinear regression, the auto-learning framework provides a parsimonious model by retaining only terms that improve the dynamical consistency with observations. It can also identify crucial latent variables and provide physical explanations. This methodology successfully reconstructs the equations of a realistic six-dimensional reference ENSO model based on the recharge oscillator theory from its data. A hierarchy of lower-dimensional models is derived and their representation of ENSO (including its diversity) systematically assessed. The minimum model that represents ENSO diversity is four-dimensional, with three interannual variables describing the western Pacific thermocline depth, the eastern and central Pacific sea surface temperatures (SSTs), and one intraseasonal variable for westerly wind events. Without the intraseasonal variable, the resulting three-dimensional model underestimates extreme events and is too regular. The limited number of weak nonlinearities in the model are essential in reproducing the observed extreme El Niño events and the observed nonlinear relationship between eastern and western Pacific SSTs.