Model reduction methods aim to describe complex dynamic phenomena using only relevant dynamical variables, decreasing computational cost and potentially highlighting key dynamical mechanisms. In the absence of special dynamical features such as scale separation or symmetries, the time evolution of these variables typically exhibits memory effects. Recent work has found a variety of data-driven model reduction methods to be effective for representing such non-Markovian dynamics, but their scope and dynamical underpinning remain incompletely understood. Here, we study data-driven model reduction from a dynamical systems perspective. For both chaotic and randomly forced systems, we show that the problem can be naturally formulated within the framework of Koopman operators and the Mori-Zwanzig projection operator formalism. We give a heuristic derivation of a NARMAX (Nonlinear Auto-Regressive Moving Average with eXogenous input) model from an underlying dynamical model. The derivation is based on a simple construction we call Wiener projection, which links Mori-Zwanzig theory to both NARMAX and classical Wiener filtering. We apply these ideas to the Kuramoto-Sivashinsky model of spatiotemporal chaos and to a viscous Burgers equation with stochastic forcing.