As ecological data sets grow in spatial and temporal extent with the advent of new remote sensing platforms and long-term monitoring networks, there is increasing interest in forecasting ecological processes. Such forecasts require realistic initial conditions over complete spatial domains. Typically, data sources are incomplete in space, and the processes involve complicated dynamical interactions across physical and biological variables. This suggests that data assimilation, whereby observations are fused with mechanistic models, is the most appropriate means of generating complete initial conditions. Often, the mechanistic models used for these procedures are computationally very expensive. We demonstrate a rank-reduced approach for ecological data assimilation whereby the mechanistic model is based on a statistical emulator. Critically, the rank reduction and the emulator construction are linked, and a hierarchical framework allows the uncertainty associated with the dynamical emulator to be accounted for. The result is a so-called “weak-constraint” data assimilation procedure. We apply the approach to a high-dimensional, multivariate, coupled biogeochemical ocean process.