Two broad classes of aggregate regional forecasting model are defined: autoprojective models belonging to the general family of space-time autoregressive-moving average processes, and explanatory (including leading indicator) models that constitute the class of space-time regression or transfer function processes. A procedure based on the estimation of space-time correlation functions is developed for identifying (selecting) operational models from these classes. For a given body of data, both the degree of non-stationarity and the type and order of representative model can be decided by studying the shape of the appropriate correlation functions. Using artificial and actual data series, the procedure is shown to perform well. The space-time autocorrelation functions give a clear indication of the sort of autoprojective process generating the data. Although less straightforward and often requiring data pre-whitening, the space-time cross-correlation function appears to be quite successful in identifying explanatory (transfer function) models that are plausible approximations to the dynamic relationships underlying the observed series.

The need to construct empirical spatial forecasting models in the context of urban and regional analysis is becoming increasingly recognized (see, for example, Chisholm, Frey and Haggett, 1971). An appropriate class of formal model for describing space-time data series and generating optimal forecasts is provided by linear stochastic difference equations. The process of model-building is concerned with relating such a class of statistical models to the data at hand and involves much more than model fitting. Forecasting procedures could be seriously deficient if these models were either inadequate or unnecessarily excessive in the use of parameters. Thus an essential stage in the model-building process is that of model identification.
By identification we mean preliminary analysis of the data to suggest what particular kind of model might be worthy of further investigation. The specific aim is to derive aggregate models possessing maximum simplicity and the smallest number of parameters consonant with representational accuracy. That is, the objective is to obtain adequate but parsimonious models (Tukey, 1961; Box and Jenkins, 1970). The identification of spatial-temporal forecasting models (and spatial models in general) is basically a 'best subset problem' of the sort frequently encountered in multivariate analysis (for example, Beale, Kendall and Mann, 1967). A variety of techniques is available for use in searching for parsimonious model structures, for example multiple step-wise regression (Newbold, 1972; Payne, 1973) and methods of canonical reduction (Daling and Tamura, 1970; Hawkins, 1973). However, in this paper we develop an identification technique based on the estimation of space-time correlation functions. This procedure has the advantage of providing insight into the nature of the process generating the space-time series for a given variable as well as suggesting a sub-class of models appropriate for that process. It should be stressed at the outset that such identification is not exact and offers only a guide to the choice of tentative models that can then be subjected to rigorous estimation and diagnostic checking.

R. L. MARTIN AND J. E. OEPPEN

In the first section we propose a broad class of space-time forecasting models structured in terms of linear difference equations. The second section develops a general framework for model identification in terms of space-time autocorrelation and cross-correlation functions. The application of these procedures to artificially generated and actual data series is discussed in the third and fourth sections.
SOME AGGREGATE SPACE-TIME FORECASTING MODELS

The various components of a space-time forecasting model include time dependence and spatial dependence lagged in time, and the possibility of (repeated) differencing in time and space to allow for temporal and spatial non-stationarity. Assume that data are available for $i = 1, 2, \ldots, n$ zones over $t = 1, 2, \ldots, T$ time periods, plus any earlier lagged values that may be required. We use the term 'zone' in the wide sense to define an areal aggregate such as a county, enumeration district or employment exchange area, or a point location such as a town. Suppose the variable of interest in zone $i$ at time $t$ is $Y_{it}$, which takes on values denoted by $y_{it}$. The models under consideration are linear equations which specify the expected value of $Y_{it}$, defined as $y^{*}_{it}$, conditional upon the observed values of $Y$ in zone $i$ and certain other zones in previous time periods and/or upon the lagged values of one or more relevant explanatory variables $X_1$, $X_2$, etc. The use of such an equation for forecasting is then obvious.

To assist in formulating the models that follow, we first make the following definition. Let $L$ be a spatial lag operator such that $L^{0} y_{it} = y_{it}$ and

$$L^{s} y_{it} = \sum_{j \in J_s} w_{ij}\, y_{jt} \quad \text{for any } s > 0, \tag{1}$$

where $J_s$ is the set of zones at spatial lag $s$ from zone $i$, and the $\{w_{ij}\}$ are weights scaled so that