Abstract

Machine learning-based, data-driven methods are increasingly used to extract structure from the ever-growing pool of geoscience big data, which typically describe the atmosphere, oceans, and land surfaces. This study applies a data-driven forecast model within the classical ensemble Kalman filter in order to reconstruct, analyze, and elucidate the underlying model. A catalog of historical data is generated by numerical simulation, and a nonparametric nearest-neighbor (analog) sampler draws from it. Based on this catalog, the physical dynamics model is reconstructed using the K-nearest neighbors algorithm. The optimal values for the surrogate model are determined, and the forecast step is performed using locally weighted linear regression. Several numerical experiments with the Lorenz-63 and Lorenz-96 models demonstrate that the proposed approach performs as well as the ensemble Kalman filter for larger catalog sizes. Although the approach is presented here in ensemble Kalman filter form, the basic strategy is not restricted to any particular version of the Kalman filter. This combined approach is found to outperform the commonly used sequential data assimilation approach when the catalog is sufficiently large.
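As an illustration of the forecast step described above, the following is a minimal sketch of analog forecasting on the Lorenz-63 system. It is not the authors' implementation. The fourth-order Runge-Kutta integrator, the time step `dt`, the Gaussian kernel with a median-distance bandwidth, the neighbor count `k`, and all function names (`build_catalog`, `analog_forecast`, and so on) are assumptions chosen for illustration only.

```python
import numpy as np


def lorenz63(x, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz-63 equations (standard parameters)."""
    return np.array([
        sigma * (x[1] - x[0]),
        x[0] * (rho - x[2]) - x[1],
        x[0] * x[1] - beta * x[2],
    ])


def rk4_step(x, dt):
    """Advance the state by one time step with classical RK4 (assumed integrator)."""
    k1 = lorenz63(x)
    k2 = lorenz63(x + 0.5 * dt * k1)
    k3 = lorenz63(x + 0.5 * dt * k2)
    k4 = lorenz63(x + dt * k3)
    return x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def build_catalog(n_pairs, dt=0.01, spinup=1000, x0=(1.0, 1.0, 1.0)):
    """Simulate a trajectory and return (analogs, successors) pairs.

    analogs[i] is a state and successors[i] is the state one step later.
    """
    x = np.array(x0, dtype=float)
    for _ in range(spinup):  # discard the transient before the attractor
        x = rk4_step(x, dt)
    traj = np.empty((n_pairs + 1, 3))
    traj[0] = x
    for i in range(n_pairs):
        traj[i + 1] = rk4_step(traj[i], dt)
    return traj[:-1], traj[1:]


def analog_forecast(x, analogs, successors, k=50):
    """One-step forecast of x from its k nearest analogs in the catalog.

    A locally weighted linear regression (with intercept) is fitted from
    the k nearest analogs to their successors and then evaluated at x.
    """
    dist = np.linalg.norm(analogs - x, axis=1)
    idx = np.argsort(dist)[:k]           # indices of the k nearest analogs
    d = dist[idx]
    bandwidth = np.median(d) + 1e-12     # assumed kernel bandwidth
    w = np.exp(-(d / bandwidth) ** 2)    # assumed Gaussian kernel weights
    design = np.hstack([np.ones((len(idx), 1)), analogs[idx]])
    sw = np.sqrt(w)[:, None]             # weighted least squares via row scaling
    coef, *_ = np.linalg.lstsq(sw * design, sw * successors[idx], rcond=None)
    return np.concatenate(([1.0], x)) @ coef
```

A call such as `analog_forecast(x, *build_catalog(10000))` returns a one-step forecast of the state `x`. In an ensemble Kalman filter, a function of this kind would be applied to each ensemble member in place of the physical model during the forecast step, with the analysis step left unchanged.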