Accurate forecasts and analyses of mortality rates are essential to many practical problems, such as population projections and the design of pension schemes. Recent studies have considered a spatial–temporal autoregressive (STAR) model, in which the mortality rate at each age depends on its own historical values (temporality) and on those of the neighboring cohort ages (spatiality). Although STAR achieves age coherence and improves forecasting accuracy over the well-known Lee-Carter (LC) model, its assumption that only effects from the same and the neighboring cohorts exist can be too restrictive. In this study, we adopt a data-driven principle, as in a sparse vector autoregressive (SVAR) model, to make the parametric structure of STAR more flexible, and develop a constrained SVAR (CSVAR) model. To optimize its objective function, which combines non-standard L2 and L1 penalties subject to constraints, we develop a new algorithm, and we prove that the desirable age coherence holds in CSVAR. Using empirical data from the United Kingdom, France, Italy, Spain, and Australia, we show that CSVAR consistently outperforms the LC, SVAR, and STAR models in forecasting accuracy. The estimates and forecasts of the CSVAR model also reveal important demographic differences among these five countries.