Recent work has shown that machine learning (ML) models can skillfully forecast the dynamics of unknown chaotic systems. Short-term predictions of the state evolution and long-term predictions of the statistical patterns of the dynamics (“climate”) can be produced by employing a feedback loop, in which the model is trained to predict only one time step ahead and its output is then fed back as input over many time steps. In the absence of mitigating techniques, however, this feedback can result in artificially rapid error growth (“instability”). One established mitigating technique is to add noise to the ML model input during training. Based on this technique, we formulate a new penalty term in the loss function for ML models with memory of past inputs. This penalty deterministically approximates the effect of many small, independent noise realizations added to the model input during training. We refer to this penalty and the resulting regularization as Linearized Multi-Noise Training (LMNT). We systematically examine the effect of LMNT, input noise, and other established regularization techniques in a case study using reservoir computing, an ML method based on recurrent neural networks, to predict the spatiotemporally chaotic Kuramoto–Sivashinsky equation. We find that reservoir computers trained with noise or with LMNT produce climate predictions that appear to be indefinitely stable and have a climate very similar to that of the true system, while their short-term forecasts are substantially more accurate than those of reservoir computers trained with other regularization techniques. Finally, we show that the deterministic nature of LMNT regularization facilitates fast tuning of the reservoir computer's regularization hyperparameters.
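
As a minimal illustration of the feedback loop and of noise-based regularization described above, the following Python sketch trains a one-step model, compares training on many noisy input realizations with a deterministic penalty, and then runs the model in closed loop. Everything in it is an assumption made for illustration: the scalar linear map x_{t+1} = 0.9 x_t stands in for the unknown system, and the function names `fit_scalar_map`, `fit_with_input_noise`, and `closed_loop_forecast` are hypothetical. It does not implement reservoir computing or the LMNT penalty itself. It only uses the standard fact that, for a linear memoryless model, the expected effect of input noise equals a ridge penalty.

```python
import random

def fit_scalar_map(inputs, targets, ridge=0.0):
    """Least-squares slope a for targets = a * inputs, with a ridge penalty."""
    sxy = sum(x * y for x, y in zip(inputs, targets))
    sxx = sum(x * x for x in inputs)
    return sxy / (sxx + ridge)

def fit_with_input_noise(inputs, targets, noise_std, n_realizations, seed=0):
    """Fit on several independently noise-perturbed copies of the inputs."""
    rng = random.Random(seed)
    noisy_inputs, repeated_targets = [], []
    for _ in range(n_realizations):
        noisy_inputs += [x + rng.gauss(0.0, noise_std) for x in inputs]
        repeated_targets += targets
    return fit_scalar_map(noisy_inputs, repeated_targets)

def closed_loop_forecast(slope, initial_state, n_steps):
    """Feed each one-step prediction back in as the next input."""
    states, state = [], initial_state
    for _ in range(n_steps):
        state = slope * state
        states.append(state)
    return states

# Toy training series from x_{t+1} = 0.9 * x_t (a stand-in, not a chaotic system).
series = [0.9 ** t for t in range(50)]
inputs, targets = series[:-1], series[1:]

noise_std = 0.05
# Stochastic regularization: pool many noisy realizations of the inputs.
slope_noise = fit_with_input_noise(inputs, targets, noise_std, n_realizations=2000)
# Deterministic counterpart: the expected noisy sum of squares adds N * noise_std**2.
slope_ridge = fit_scalar_map(inputs, targets, ridge=len(inputs) * noise_std ** 2)
# Closed-loop prediction: the trained one-step model is iterated on its own output.
forecast = closed_loop_forecast(slope_ridge, series[-1], n_steps=10)
```

In this toy setting the two slopes agree up to sampling error, which is the sense in which a deterministic penalty can replace many noise realizations. For a nonlinear model with memory the correspondence is only approximate.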