Abstract

Spatial statistics is the science of analysing geo-referenced data and, loosely speaking, may be divided into three sub-areas: the analysis of point processes, the analysis of areal data, and geostatistics. Point processes arise naturally, for example, in geophysics, when the locations of earthquakes are recorded; in epidemiology, where new cases of certain epidemic diseases are mapped geographically; or in biology, where the cell centers of a certain tissue are mapped under the microscope. Areal data are data attached to areas, such as the number of cases of an illness in certain medical districts, the percentage of grassland in a certain county, or the number of votes in certain political districts. The topic of this article, however, is geostatistics, the science of continuous stochastic processes, or so-called random fields, that are defined over some region in 2- or 3-dimensional geographic space X or in space-time. A random field {Y(x) : x ∈ X} is a set of random variables, or so-called regionalized variables, Y(x) that are attached to every location x ∈ X. The probability law of the random field is uniquely determined by its so-called projective family, the set of all finite-dimensional distributions of any finite set of Y(x), which must obey symmetry and consistency with marginal distributions. As we will see, however, geostatistics most often deals only with the first- and second-order characteristics of random fields: the trend or mean function m(x) = E(Y(x)) and the covariance function C(x, y) = Cov(Y(x), Y(y)). Most often a linear trend function m(x) = f(x)^T β, where f(x) is a fixed vector-valued function and β is a regression parameter vector to be estimated, is sufficient for modelling purposes. For x ∈ R^2, f(x) could be, for example, a vector of polynomials in the coordinates x = (x1, x2). The covariance function C(., .) must be positive semidefinite, meaning that it must give any linear combination of Y(x1), Y(x2), ..., Y(xn) a nonnegative variance.
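The two ingredients above, a linear trend basis f(x) and a positive semidefinite covariance function, can be illustrated with a small numerical sketch. The first-order polynomial basis (1, x1, x2), the exponential covariance model, its `sill` and `range_` parameters, and the random sample locations are all assumptions chosen for illustration and are not taken from the article.

```python
import numpy as np

def linear_trend_basis(x):
    """Rows f(x)^T = (1, x1, x2): an assumed first-order polynomial basis."""
    x = np.atleast_2d(x)
    return np.column_stack([np.ones(len(x)), x[:, 0], x[:, 1]])

def exponential_covariance(a, b, sill=1.0, range_=1.0):
    """Assumed model C(x, y) = sill * exp(-||x - y||_2 / range_)."""
    dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sill * np.exp(-dist / range_)

rng = np.random.default_rng(0)
locations = rng.uniform(0.0, 10.0, size=(20, 2))  # hypothetical sites x1..xn

F = linear_trend_basis(locations)                  # n x 3 design matrix
K = exponential_covariance(locations, locations, sill=2.0, range_=3.0)

# Variance of the linear combination sum_i a_i Y(x_i) is a^T K a.
a = rng.normal(size=len(locations))
variance_of_combination = a @ K @ a

# Positive semidefiniteness means no eigenvalue of K is negative.
min_eigenvalue = np.linalg.eigvalsh(K).min()
```

For a valid covariance model, `variance_of_combination` is nonnegative for every coefficient vector `a`, which is equivalent to `min_eigenvalue` being nonnegative up to rounding error.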
Most often an additional assumption of second-order stationarity is made, meaning that C(x, x + h) = C(h) depends only on the lag h and not on the locations x and x + h themselves. The next stronger assumption is that of isotropy, meaning C(x, x + h) = C(||h||_2), where ||h||_2 is the Euclidean length of h. Both second-order stationarity and isotropy are assumed in order to make the covariance function estimable from a single realization of the random variables, that is, from the data y(x1), y(x2), ..., y(xn). The task of geostatistics is to produce a prediction map of all y(x0), x0 ∈ X, based on the available data y(x1), y(x2), ..., y(xn) and to report on the accuracy of these predictions. The best-known methodology for this task of interpolation or map drawing is kriging, also known as best linear unbiased prediction. The so-called universal kriging predictor depends on the data y = (y(x1), y(x2), ..., y(xn))^T, the covariance matrix K of the corresponding random variables, and the covariance vector c0 between these random variables and the random
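As a rough illustration of how the quantities y, K and c0 enter the prediction, the following sketch implements the standard textbook form of the universal kriging predictor, ŷ(x0) = f(x0)^T β̂ + c0^T K^{-1} (y − F β̂), with the generalized least squares estimate β̂ = (F^T K^{-1} F)^{-1} F^T K^{-1} y, together with the usual kriging variance. It assumes a known isotropic exponential covariance with hypothetical parameters `SILL` and `RANGE`, a first-order polynomial trend, and synthetic data; in practice the covariance function has to be estimated from the data, which this sketch does not attempt.

```python
import numpy as np

SILL, RANGE = 1.0, 2.0  # assumed, known covariance parameters

def trend_basis(x):
    """Rows f(x)^T = (1, x1, x2): an assumed linear trend basis."""
    x = np.atleast_2d(x)
    return np.column_stack([np.ones(len(x)), x])

def cov(a, b):
    """Assumed isotropic model C(h) = SILL * exp(-||h||_2 / RANGE)."""
    dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return SILL * np.exp(-dist / RANGE)

def universal_kriging(x_obs, y, x0):
    """Return predictions and kriging variances at the rows of x0."""
    F, f0 = trend_basis(x_obs), trend_basis(x0)   # n x p, m x p
    K, c0 = cov(x_obs, x_obs), cov(x_obs, x0)     # n x n, n x m
    Kinv_F = np.linalg.solve(K, F)
    Kinv_c0 = np.linalg.solve(K, c0)
    A = F.T @ Kinv_F                              # F^T K^{-1} F
    beta = np.linalg.solve(A, Kinv_F.T @ y)       # GLS trend estimate
    pred = f0 @ beta + Kinv_c0.T @ (y - F @ beta)
    # Variance: C(0) - c0^T K^{-1} c0 + u^T A^{-1} u, u = f0 - F^T K^{-1} c0
    u = f0.T - F.T @ Kinv_c0                      # p x m
    var = (SILL - np.sum(c0 * Kinv_c0, axis=0)
           + np.sum(u * np.linalg.solve(A, u), axis=0))
    return pred, var

rng = np.random.default_rng(1)
x_obs = rng.uniform(0.0, 10.0, size=(15, 2))      # hypothetical data sites
y_obs = 1.0 + 2.0 * x_obs[:, 0] - x_obs[:, 1] + np.sin(x_obs[:, 0])
x_new = np.array([[2.5, 7.0], [5.0, 5.0]])        # prediction locations x0

pred, var = universal_kriging(x_obs, y_obs, x_new)
```

Without a measurement-error (nugget) term, this predictor reproduces the data exactly at the observed locations, and the kriging variance there is zero up to rounding error; `var` is the accuracy measure that would accompany each mapped prediction.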
