Understanding and predicting environmental phenomena often requires the construction of spatio-temporal statistical models, which are typically Gaussian processes. A common assumption made about Gaussian processes is that of covariance stationarity, which is unrealistic in many geophysical applications. In this article, we introduce a deep-learning-inspired approach to constructing descriptive nonstationary spatio-temporal models by modeling stationary processes on warped spatio-temporal domains. The warping functions we use are built from several simple injective warping units which, when combined through composition, can induce complex warpings. A stationary spatio-temporal covariance function on the warped domain induces covariance nonstationarity on the original domain. Sparse linear algebraic methods are used to reduce the computational complexity of fitting the model in a big-data setting. We show that our proposed nonstationary spatio-temporal model can capture covariance nonstationarity in both space and time, and that it provides better probabilistic predictions than conventional stationary models, both in simulation studies and on a real-world data set.
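To illustrate the core idea, the following is a minimal sketch of how composing simple injective warping units and then applying a stationary covariance function on the warped domain yields a nonstationary covariance on the original domain. The unit used here (identity plus a nonnegatively weighted sigmoid on one coordinate), the squared-exponential covariance, and all names and parameter values are assumptions chosen for illustration; they are not necessarily the units or covariance functions used in the article.

```python
import numpy as np

def axial_unit(x, dim, weight, slope, center):
    """Hypothetical warping unit: shifts one coordinate by a scaled sigmoid.

    With weight >= 0 and slope > 0 the map is strictly increasing in that
    coordinate, hence injective.
    """
    assert weight >= 0 and slope > 0
    out = x.copy()
    out[:, dim] = x[:, dim] + weight / (1.0 + np.exp(-slope * (x[:, dim] - center)))
    return out

def warp(x, units):
    """Compose the units in order; a composition of injective maps is injective."""
    for unit in units:
        x = axial_unit(x, **unit)
    return x

def sq_exp_cov(a, b, variance=1.0, length_scale=0.5):
    """Stationary squared-exponential covariance, evaluated on warped inputs."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return variance * np.exp(-0.5 * d2 / length_scale**2)

# Column 0 is a 1-D spatial coordinate s, column 1 is time t (illustrative values).
units = [
    {"dim": 0, "weight": 2.0, "slope": 10.0, "center": 0.5},  # stretch space near s = 0.5
    {"dim": 1, "weight": 1.0, "slope": 5.0, "center": 0.2},   # stretch time near t = 0.2
]

rng = np.random.default_rng(0)
pts = rng.uniform(0.0, 1.0, size=(50, 2))
warped = warp(pts, units)
K = sq_exp_cov(warped, warped)  # nonstationary covariance on the original domain
```

In this sketch, two pairs of points separated by the same spatial lag receive different covariances depending on where they lie, because the warping stretches the domain near s = 0.5. Since the warping is injective and the covariance on the warped domain is valid, the induced covariance matrix remains symmetric and positive semidefinite.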