10  Continuously varying random fields

10.1 Introduction

As discussed in the previous chapter, we often think of a spatial dataset as being a partial observation of a latent random field that varies continuously in space. It will be helpful to think a bit more carefully about what we mean by random fields and how they should be characterised. Essentially, a random field is a random function whose domain is the spatial domain. That is, for each \(\mathbf{s}\in\mathcal{D}\subset\mathbb{R}^d\) (for some \(d\in\mathbb{N}\), and typically \(d=1,2\) or \(3\)), \(Z(\mathbf{s})\) is a random variable over the sample space \(\mathcal{Z}\subset\mathbb{R}\). A realisation of the random field will be the collection of realisations of the random variables, \(\{z(\mathbf{s})|\mathbf{s}\in\mathcal{D}\}\), and hence a (deterministic) function \(\mathcal{D}\rightarrow\mathcal{Z}\). The variable \(z(\cdot)\) is sometimes said to be regionalised, in order to emphasise its dependence on a continuously varying spatial parameter.

We typically characterise a random variable by its probability distribution, but there is a technical difficulty here, in that we have a distinct (though typically not independent) random quantity at each point in a continuously varying space, and so our random variable is infinite dimensional. Defining probability distributions over such variables is not completely straightforward. Nevertheless, it seems intuitively reasonable that if we are able to specify the joint distribution for any arbitrary finite collection of sites, then this should completely determine the full distribution of the random field. This turns out to be true provided that some reasonable consistency conditions are satisfied. We can define the cumulative distribution function for a finite number of sites as \[ F(z_1,z_2,\ldots,z_n;\mathbf{s}_1,\mathbf{s}_2,\ldots,\mathbf{s}_n) = \mathbb{P}[Z(\mathbf{s}_1)\leq z_1,Z(\mathbf{s}_2)\leq z_2, \ldots,Z(\mathbf{s}_n)\leq z_n]. \] In order to be consistent, \(F(\cdot)\) should be symmetric under a permutation of the site indices (it shouldn't matter in which order the sites are listed), and further, \(F(\cdot)\) should be consistent under marginalisation. That is, \[ F(\infty,z_2,\ldots,z_n;\mathbf{s}_1,\mathbf{s}_2,\ldots,\mathbf{s}_n) = F(z_2,\ldots,z_n;\mathbf{s}_2,\ldots,\mathbf{s}_n). \] Provided that an \(F(\cdot)\) can be defined for all \(n\in\mathbb{N}\) in this consistent way, it determines the probability distribution of a random field whose finite-dimensional distributions are consistent with \(F(\cdot)\). This result is known as Kolmogorov's extension theorem (AKA the Kolmogorov existence theorem, and sometimes the Kolmogorov consistency theorem).
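
To make the marginalisation condition concrete, the following sketch checks it numerically for Gaussian finite-dimensional distributions, where marginalising simply deletes the corresponding rows and columns of the covariance matrix. The squared-exponential kernel used here is just an illustrative assumption, not something implied by the theorem.

```python
import numpy as np

# A minimal sketch of Kolmogorov consistency for Gaussian finite-dimensional
# distributions: the joint distribution over three sites, marginalised over
# the middle site, matches the distribution built directly on the other two.
# The squared-exponential kernel is an assumed, illustrative choice.

def k(s, t, ell=1.0):
    return np.exp(-0.5 * (s - t) ** 2 / ell**2)

sites = np.array([0.0, 0.7, 2.3])
Sigma3 = k(sites[:, None], sites[None, :])      # covariance over all 3 sites

# For a (zero-mean) Gaussian, marginalising over site 2 just deletes its
# row and column from the covariance matrix.
Sigma2_marginal = Sigma3[np.ix_([0, 2], [0, 2])]
sub = sites[[0, 2]]
Sigma2_direct = k(sub[:, None], sub[None, :])   # built directly on 2 sites

print(np.allclose(Sigma2_marginal, Sigma2_direct))  # True
```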

Not all random fields have finite moments, but when the first moments exist, it is useful to define the mean function \(\mu:\mathcal{D}\rightarrow\mathbb{R}\) by \[ \mu(\mathbf{s}) = \mathbb{E}[Z(\mathbf{s})], \quad \forall \mathbf{s}\in\mathcal{D}. \] Similarly, when second moments exist, it is useful to define the covariance function \(C:\mathcal{D}\times\mathcal{D}\rightarrow\mathbb{R}\) by \[ C(\mathbf{s},\mathbf{s}') = \mathbb{C}\mathrm{ov}[Z(\mathbf{s}), Z(\mathbf{s}')], \quad \forall \mathbf{s},\mathbf{s}'\in\mathcal{D}. \] The covariance function must define a positive semi-definite kernel. This means that for any finite collection of sites, \(\mathbf{s}_1,\mathbf{s}_2,\ldots,\mathbf{s}_n\in\mathcal{D}\), and any \(n\)-vector \(\pmb{\alpha}=(\alpha_1,\alpha_2,\ldots,\alpha_n)^\top\in\mathbb{R}^n\), we must have \[ \sum_{j=1}^n \sum_{k=1}^n \alpha_j\alpha_k C(\mathbf{s}_j,\mathbf{s}_k) \geq 0. \] This is clearly required, since \[ \sum_{j=1}^n \sum_{k=1}^n \alpha_j\alpha_k C(\mathbf{s}_j,\mathbf{s}_k) = \mathbb{V}\mathrm{ar}\left[ \sum_{j=1}^n \alpha_j Z(\mathbf{s}_j) \right] \geq 0. \] When first and second moments exist, the field is called a second-order random field.
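
As a quick numerical sanity check (a sketch, with the exponential kernel as an assumed example), positive semi-definiteness can be verified by confirming that the covariance matrix over an arbitrary finite collection of sites has no negative eigenvalues, or equivalently that the quadratic form above is non-negative.

```python
import numpy as np

rng = np.random.default_rng(42)
s = rng.uniform(0, 10, size=20)               # 20 arbitrary sites in 1-d
C = np.exp(-np.abs(s[:, None] - s[None, :]))  # exponential kernel (assumed valid)

# Positive semi-definiteness: all eigenvalues >= 0 (up to rounding error)
print(np.linalg.eigvalsh(C).min() > -1e-10)   # True

# Equivalently, the quadratic form is non-negative for any alpha
alpha = rng.normal(size=20)
print(alpha @ C @ alpha >= 0)                 # True
```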

10.2 Stationarity

In the context of time series analysis, we found stationary time series models to be useful building blocks for creating more realistic time series models. The same is true in spatial statistics. For most real, interesting spatial datasets, a stationarity assumption is not reasonable, but a stationary component is often invaluable in the construction of a more appropriate model. A time series model is stationary if its distribution is invariant under a shift in time. A random field is stationary if its distribution is invariant under a shift in space.

A random field is strictly stationary iff \[ F(z_1,z_2,\ldots,z_n;\mathbf{s}_1+\mathbf{h},\mathbf{s}_2+\mathbf{h},\ldots,\mathbf{s}_n+\mathbf{h}) = F(z_1,z_2,\ldots,z_n;\mathbf{s}_1,\mathbf{s}_2,\ldots,\mathbf{s}_n) \] for all finite collections of sites and all shift vectors \(\mathbf{h}\in\mathcal{D}\).

10.3 Weak stationarity

Strict stationarity is an extremely powerful property for a process to have, but just as for time series, a weaker notion of stationarity is useful in practice. A second-order random field is weakly stationary (AKA second-order stationary) iff

  • \(\mu(\mathbf{s}) = \mu(\mathbf{s}+\mathbf{h}) = \mu,\quad \forall \mathbf{s},\mathbf{h}\in\mathcal{D}\),
  • \(C(\mathbf{s}, \mathbf{s}+\mathbf{h}) = C(\mathbf{s}', \mathbf{s}'+\mathbf{h}) = C(\mathbf{0}, \mathbf{h}),\quad \forall \mathbf{s},\mathbf{s}',\mathbf{h}\in\mathcal{D}\).

It is clear that a strictly stationary second-order random field will also be weakly stationary. For weakly stationary random fields we define the function \(C:\mathcal{D}\rightarrow\mathbb{R}\) by \[ C(\mathbf{h}) \equiv C(\mathbf{0},\mathbf{h}). \] This function is the equivalent of the auto-covariance function from time series analysis. In the context of spatial statistics, this function is known as the covariogram. Although we are using \(C\) for two (slightly) different functions, the different domains allow disambiguation. It is clear that the covariogram determines the covariance function, since \(C(\mathbf{s},\mathbf{s}')=C(\mathbf{s}'-\mathbf{s})\). Some basic properties of the covariogram derive directly from properties of covariance.

  • \(\mathbb{V}\mathrm{ar}[Z(\mathbf{s})] = C(\mathbf{0}),\ \forall \mathbf{s}\in\mathcal{D}\)
  • \(C(\mathbf{h}) = C(-\mathbf{h}),\ \forall \mathbf{h}\in\mathcal{D}\)
  • \(|C(\mathbf{h})| \leq C(\mathbf{0}),\ \forall \mathbf{h}\in \mathcal{D}\)
  • \(C\) is a positive semi-definite function. That is, for any finite collection of sites, \(\mathbf{s}_1,\mathbf{s}_2,\ldots,\mathbf{s}_n\in\mathcal{D}\), and any \(n\)-vector \(\pmb{\alpha}=(\alpha_1,\alpha_2,\ldots,\alpha_n)^\top\in\mathbb{R}^n\) we have \[ \sum_{j=1}^n \sum_{k=1}^n \alpha_j\alpha_k C(\mathbf{s}_j-\mathbf{s}_k) \geq 0. \]
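
Since the covariogram determines the covariance function via \(C(\mathbf{s},\mathbf{s}')=C(\mathbf{s}'-\mathbf{s})\), specifying a covariogram is enough to simulate a (Gaussian) weakly stationary field at any finite collection of sites. The following sketch does this on a 1-d grid via a Cholesky factor; the Gaussian assumption and the exponential covariogram are illustrative choices, not implied by anything above.

```python
import numpy as np

rng = np.random.default_rng(0)

def covariogram(h):
    return np.exp(-np.abs(h))   # an assumed, valid covariogram

s = np.linspace(0, 10, 200)                         # grid of sites
K = covariogram(s[:, None] - s[None, :])            # C(s_j - s_k)
L = np.linalg.cholesky(K + 1e-10 * np.eye(len(s)))  # jitter aids stability
z = L @ rng.normal(size=len(s))                     # one realisation of the field
print(z[:5])
```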

10.4 Intrinsic stationarity

Sometimes even weak stationarity is too strong a condition. Some processes of interest in geostatistics are not stationary, but do have stationary (and zero-mean) increments. A second-order process is intrinsically stationary iff

  • \(\mathbb{E}[Z(\mathbf{s}+\mathbf{h})-Z(\mathbf{s})] = 0, \ \forall \mathbf{s},\mathbf{h}\in\mathcal{D}\),
  • \(\mathbb{V}\mathrm{ar}[Z(\mathbf{s}+\mathbf{h})-Z(\mathbf{s})] = \mathbb{V}\mathrm{ar}[Z(\mathbf{s}'+\mathbf{h})-Z(\mathbf{s}')] = \mathbb{V}\mathrm{ar}[Z(\mathbf{h})-Z(\mathbf{0})],\ \forall \mathbf{s},\mathbf{s}',\mathbf{h}\in \mathcal{D}\).

It is clear that weakly stationary processes are intrinsically stationary, but the converse is not true in general. Note that the Wiener process is a simple 1-d example of a random field that is intrinsically stationary but not weakly stationary.

We define the semi-variogram function, \(\gamma:\mathcal{D}\rightarrow\mathbb{R}\) by \[ \gamma(\mathbf{h}) \equiv \frac{1}{2}\mathbb{V}\mathrm{ar}[Z(\mathbf{h})-Z(\mathbf{0})], \] and \(2\gamma(\mathbf{h})\) is the variogram. Note that this use of \(\gamma(\cdot)\) for the semi-variogram is ubiquitous in geostatistics, but somewhat conflicts with its use as an auto-covariance function in time series analysis. Fortunately, it is usually clear from the context which is intended. In the case of an intrinsically stationary process, we have \[ \gamma(\mathbf{h}) = \frac{1}{2}\mathbb{V}\mathrm{ar}[Z(\mathbf{s}+\mathbf{h})-Z(\mathbf{s})], \] where the RHS is independent of \(\mathbf{s}\), and \(\mathbb{V}\mathrm{ar}[Z(\mathbf{s}+\mathbf{h})-Z(\mathbf{s})]\) is the variogram. Note that intrinsically stationary processes have constant mean (say, \(\mathbb{E}[Z(\mathbf{s})]=\mu\)). Unfortunately, the mean and variogram do not completely characterise an intrinsically stationary process, since \(Z(\mathbf{s})\) and \(Z(\mathbf{s})+U\) (where \(U\) is a mean zero random quantity independent of \(Z(\cdot)\)) both have exactly the same increments, the same mean, and the same variogram, but different variances. This problem is usually addressed by fixing (or registering) the value of the field at a single point in space, \(\mathbf{s}_0\in\mathcal{D}\), ie. \(Z(\mathbf{s}_0)\equiv\mu\). This is exactly analogous to fixing the value of a Wiener process to be zero at time zero.
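
The non-identifiability just described is easy to see numerically. In the following sketch (with the Wiener process, discussed in full below, standing in for \(Z(\cdot)\), and our own choice of distribution for \(U\)), the fields \(Z(\mathbf{s})\) and \(Z(\mathbf{s})+U\) have matching empirical variograms but very different variances.

```python
import numpy as np

rng = np.random.default_rng(3)
n_rep, n_t, dt = 2000, 100, 0.1

# Z: Wiener paths started at 0; Zu: the same paths plus a random offset U
Z = np.cumsum(rng.normal(scale=np.sqrt(dt), size=(n_rep, n_t)), axis=1)
U = rng.normal(scale=5.0, size=(n_rep, 1))
Zu = Z + U

lag = 10                                # lag h = lag*dt = 1.0
gamma_Z  = 0.5 * np.mean((Z[:, lag:] - Z[:, :-lag]) ** 2)
gamma_Zu = 0.5 * np.mean((Zu[:, lag:] - Zu[:, :-lag]) ** 2)
print(gamma_Z, gamma_Zu)                # both approximately h/2 = 0.5
print(Z[:, -1].var(), Zu[:, -1].var())  # variances differ by Var[U] = 25
```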

The variogram is a useful concept because it is well-defined for all intrinsically stationary processes, whereas the covariogram is only well-defined for weakly stationary processes. For a process that is weakly stationary, it is clear that the covariogram and variogram are related by \[ \gamma(\mathbf{h}) = C(\mathbf{0}) - C(\mathbf{h}), \quad \forall\mathbf{h}\in\mathcal{D}, \tag{10.1}\] since \(2\gamma(\mathbf{h}) = \mathbb{V}\mathrm{ar}[Z(\mathbf{h})-Z(\mathbf{0})] = C(\mathbf{0})+C(\mathbf{0})-2C(\mathbf{h})\). This allows direct computation of \(\gamma(\cdot)\) from \(C(\cdot)\). To compute \(C(\cdot)\) from \(\gamma(\cdot)\), the stationary variance of the process, \(C(\mathbf{0})\), must also be known. It is worth noting that since \(|C(\mathbf{h})|\leq C(\mathbf{0})\), \(\gamma(\mathbf{h})\) is bounded above by \(2C(\mathbf{0})\). Consequently, weakly stationary processes cannot have unbounded variograms, and so any intrinsically stationary process with an unbounded variogram cannot be weakly stationary.

In the case of a weakly stationary process, it is relatively straightforward to map back and forth between the (semi-)variogram and the covariogram. For an intrinsically stationary process, there is no covariogram, but mapping back and forth between the variogram and the covariance function is often of interest. We will start by computing the variogram from the covariance function, since that is simpler. \[\begin{align*} 2\gamma(\mathbf{h}) &= \mathbb{V}\mathrm{ar}[Z(\mathbf{s}+\mathbf{h})-Z(\mathbf{s})] \\ &= \mathbb{V}\mathrm{ar}[Z(\mathbf{s}+\mathbf{h})] + \mathbb{V}\mathrm{ar}[Z(\mathbf{s})] - 2\mathbb{C}\mathrm{ov}[Z(\mathbf{s}+\mathbf{h}), Z(\mathbf{s})], \end{align*}\] and hence \[ 2\gamma(\mathbf{h}) = C(\mathbf{s}+\mathbf{h}, \mathbf{s}+\mathbf{h}) + C(\mathbf{s}, \mathbf{s}) - 2C(\mathbf{s}+\mathbf{h},\mathbf{s}). \tag{10.2}\] We can use this to compute the variogram from the covariance function. Since the LHS is independent of \(\mathbf{s}\), so is the RHS, and it will often be convenient to evaluate it at \(\mathbf{s}=\mathbf{0}\), \[ 2\gamma(\mathbf{h}) = C(\mathbf{h}, \mathbf{h}) + C(\mathbf{0}, \mathbf{0}) - 2C(\mathbf{h},\mathbf{0}). \] In general the variogram is insufficient to determine the covariance function, due to the registration problem. However, if we know the value of the field at some location, then it does become possible to determine the covariance function. It is most convenient if the field is known at the origin, ie. fix \(Z(\mathbf{0})=\mu\). Then \(C(\mathbf{h},\mathbf{0})=C(\mathbf{0},\mathbf{0})=0\), and the above equation simplifies to \[ 2\gamma(\mathbf{h}) = C(\mathbf{h},\mathbf{h}), \] which we can substitute back into Equation 10.2 and rearrange to get \[ C(\mathbf{s}+\mathbf{h}, \mathbf{s}) = \gamma(\mathbf{s}+\mathbf{h})+\gamma(\mathbf{s})-\gamma(\mathbf{h}), \] or alternatively, \[ C(\mathbf{s}',\mathbf{s}) = \gamma(\mathbf{s}')+\gamma(\mathbf{s}) - \gamma(\mathbf{s}'-\mathbf{s}). \tag{10.3}\] So, provided that the field is fixed at the origin, it is actually very straightforward to compute the covariance function from the variogram. It is in this sense that the variogram almost completely characterises an intrinsically stationary random field.
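
Equation 10.3 translates directly into code. The sketch below (the function name is our own) builds the covariance matrix over arbitrary sites from a semi-variogram, using the Wiener semi-variogram \(\gamma(h)=|h|/2\) from the next subsection as a test case.

```python
import numpy as np

def cov_from_semivariogram(gamma, s):
    """Equation 10.3: C(s_j, s_k) = gamma(s_j) + gamma(s_k) - gamma(s_j - s_k),
    valid when the field is registered so that Z(0) = mu."""
    return gamma(s[:, None]) + gamma(s[None, :]) - gamma(s[:, None] - s[None, :])

gamma = lambda h: 0.5 * np.abs(h)        # Wiener semi-variogram (next subsection)
s = np.array([0.5, 1.0, 1.5])
print(cov_from_semivariogram(gamma, s))  # recovers min(s_j, s_k)
```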

TODO: some more properties of the variogram

10.4.1 Examples in 1-d

Examples in 1-d can be considered as either having a single spatial dimension, or as being a model for a process in continuous time. We mentioned a few such processes briefly during the time series part of the course.

10.4.1.1 Wiener process (Brownian motion)

The Wiener process (mentioned briefly in Chapter 2) is typically defined to have \(W(0)=0\) and \(\mathbb{V}\mathrm{ar}[W(s)-W(t)] = |s-t|,\ s,t\in\mathbb{R}\) (and independent mean zero Gaussian increments). It is therefore intrinsically stationary with semi-variogram \[ \gamma(h) = \frac{1}{2}\mathbb{V}\mathrm{ar}[W(h)-W(0)] = \frac{1}{2}|h|. \] Since it is fixed at zero, we can easily compute its covariance function as \[\begin{align*} C(s, t) &= \gamma(s) + \gamma(t) - \gamma(s-t) \\ &= \frac{1}{2}\left(|s|+|t|-|s-t|\right), \end{align*}\] which for \(s,t\geq 0\) reduces to the familiar \(C(s,t)=\min\{s,t\}\) (for \(s\) and \(t\) of opposite sign it is zero, reflecting the independence of the two branches of the process either side of the origin).
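
As a quick Monte Carlo check of this covariance function (a sketch under our own discretisation choices), simulating many Wiener paths and computing the empirical covariance at two fixed times should recover \(\min\{s,t\}\).

```python
import numpy as np

rng = np.random.default_rng(4)
n_rep, n_t, dt = 50_000, 200, 0.01
W = np.cumsum(rng.normal(scale=np.sqrt(dt), size=(n_rep, n_t)), axis=1)

i, j = 49, 149                         # grid indices for s = 0.5, t = 1.5
s, t = (i + 1) * dt, (j + 1) * dt
emp = np.cov(W[:, i], W[:, j])[0, 1]   # empirical Cov[W(s), W(t)]
print(emp, min(s, t))                  # both approximately 0.5
```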

10.4.1.2 Fractional Brownian motion (fBM)

The Wiener process has a variogram that increases linearly. It is natural to consider processes with a variogram that increases non-linearly, as a fractional power. That is, a process for which \[ \mathbb{V}\mathrm{ar}[Z(t+h)-Z(t)] = |h|^{2H}, \] where the parameter \(H\in(0,1)\) is known as the Hurst index. Clearly the choice \(H=1/2\) gives the Wiener process, but other choices are associated with an intrinsically stationary process known as fractional Brownian motion. Since the variogram is unbounded, we know that the process cannot be weakly stationary, but using \[ \gamma(h) = \frac{1}{2}|h|^{2H}, \] if we fix \(Z(0)=0\) we can compute the covariance function as \[ C(s, t) = \frac{1}{2}\left( |s|^{2H} + |t|^{2H} - |s-t|^{2H} \right). \] It is worth noting that for \(H\not=1/2\), the increments of this process are stationary but not independent. We will revisit this process in Chapter 11.
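
Since the covariance function is available in closed form, fBM can be simulated exactly on a grid by factorising the covariance matrix, as in the following sketch (an \(O(n^3)\) approach, fine for small grids; the grid and Hurst values are arbitrary choices).

```python
import numpy as np

def fbm_cov(t, H):
    """C(s, t) = (|s|^2H + |t|^2H - |s-t|^2H) / 2, with Z(0) = 0 fixed."""
    return 0.5 * (np.abs(t[:, None]) ** (2 * H)
                  + np.abs(t[None, :]) ** (2 * H)
                  - np.abs(t[:, None] - t[None, :]) ** (2 * H))

rng = np.random.default_rng(5)
t = np.linspace(0.01, 1.0, 100)   # exclude t=0, where the variance is zero
for H in [0.2, 0.5, 0.8]:
    L = np.linalg.cholesky(fbm_cov(t, H) + 1e-10 * np.eye(len(t)))
    z = L @ rng.normal(size=len(t))   # one fBM path on the grid
    print(H, np.round(z[:3], 3))
```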

10.4.1.3 The OU process

We introduced the OU process in Section 2.3.2.3. By construction this is a Markov process (but note that, unlike the Wiener process, it does not have independent increments), and we showed that for reversion parameter \(\lambda>0\), the process is stationary, with covariogram \[ C(t) = \frac{\sigma^2}{2\lambda}e^{-\lambda|t|}, \] known as the exponential covariance function. We can use Equation 10.1 to compute the semi-variogram as \[ \gamma(t) = \frac{\sigma^2}{2\lambda}\left(1-e^{-\lambda|t|}\right). \] So, for \(t>0\), the semi-variogram increases from zero, asymptoting to the stationary variance of the process.
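
The exponential covariogram (and hence, via Equation 10.1, the semi-variogram) can be checked by simulation. The sketch below uses the exact AR(1) discretisation of the OU process (parameter values are arbitrary choices) and compares the empirical covariogram at a few lags with the formula above.

```python
import numpy as np

rng = np.random.default_rng(6)
lam, sigma, dt, n = 0.5, 1.0, 0.1, 200_000

# Exact AR(1) discretisation: x_{k+1} = exp(-lam*dt) * x_k + eps_k
phi = np.exp(-lam * dt)
sd = sigma * np.sqrt((1 - phi**2) / (2 * lam))     # innovation std. dev.
x = np.empty(n)
x[0] = rng.normal(scale=sigma / np.sqrt(2 * lam))  # start at stationarity
for k in range(1, n):
    x[k] = phi * x[k - 1] + rng.normal(scale=sd)

for lag in [0, 5, 20]:
    t = lag * dt
    theory = sigma**2 / (2 * lam) * np.exp(-lam * t)
    emp = np.mean(x[lag:] * x[:n - lag])   # empirical covariogram (mean ~ 0)
    print(t, theory, emp)
```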

10.5 Isotropy

TODO

10.6 The nugget

TODO