For a number of structural and fatigue problems in marine science, engineers need accurate statistics for kinematic and dynamical quantities, such as displacements and bending moments. For many such problems the response statistics depend nonlinearly on input stochastic processes, such as sea surface elevation. Nonlinearity prevents the use of standard linear methods, leading to the adoption of expensive experiments and time-consuming numerical simulations. In order to avoid this cost we present a machine learning framework to minimize the training set requirements. This framework consists of two parts. First, we use the Karhunen–Loève theorem to represent stochastic sea states with finite-time wave episodes, which have low dimensionality that nonetheless captures the features important to hydrodynamics and structural mechanics. However, the choice of the wave episode duration is ‘caught’ between the Scylla of slow Karhunen–Loève series convergence for long time durations and the Charybdis of missing transient behaviors when the interval is short. To combat this dilemma, we propose a division into a region, designed for parametric interpolation and machine learning, and a stochastic prelude region, designed to probabilistically model transients. The second part of the framework consists of a Gaussian Process Regression (GPR) model designed to learn the mapping from wave episodes to structural outputs. GPR is able to take advantage of the low dimensional parametric representation of the sea state in order to converge with reasonably-sized training sets (on the order n≈100). At the same time, we use a low-dimensional representation in order to represent the stochastic response time series. The principal advantages of the Gaussian process surrogate are the blazing speed of evaluation – ten thousand times faster than the direct method – and the built in uncertainty quantification. Taken together, we can reconstruct the statistics of the responses by sampling sea states via the Karhunen–Loève construction, estimating the corresponding model outputs using the trained GPR, and estimating statistics of interest through Monte-Carlo calculation on the surrogate model. Finally the developed framework allows for trivial computation of the statistics for different sea spectra, i.e. without the need to retrain the Gaussian Process Regression scheme.